library(tidyverse)  # read_csv(), mutate_if(), %>%
library(magrittr)   # %<>% (used further below)
data <- read_csv("prosperLoanData.csv") %>%
  mutate_if(is.character, as.factor)
Above, I imported the character columns as factors: on closer inspection, they are category labels rather than free-text strings (nothing in the following analysis contradicts this). The first thing I will do now is look more closely at the data to see whether the other columns are formatted appropriately:
data[,1:7]
## # A tibble: 113,937 x 7
## ListingKey ListingNumber ListingCreationDate CreditGrade Term
## <fct> <int> <dttm> <fct> <int>
## 1 102133976686814541… 193129 2007-08-26 19:09:29 C 36
## 2 10273602499503308B… 1209647 2014-02-27 08:28:07 <NA> 36
## 3 0EE933782585103286… 81716 2007-01-05 15:00:47 HR 36
## 4 0EF535600248271529… 658116 2012-10-22 11:02:35 <NA> 36
## 5 0F023589499656230C… 909464 2013-09-14 18:38:39 <NA> 36
## 6 0F0535973482419938… 1074836 2013-12-14 08:26:37 <NA> 60
## 7 0F0A3576754255009D… 750899 2013-04-12 09:52:56 <NA> 36
## 8 0F1035772717087366… 768193 2013-05-05 06:49:27 <NA> 36
## 9 0F043596202561788E… 1023355 2013-12-02 10:43:39 <NA> 36
## 10 0F043596202561788E… 1023355 2013-12-02 10:43:39 <NA> 36
## # ... with 113,927 more rows, and 2 more variables: LoanStatus <fct>,
## # ClosedDate <dttm>
str(data)
## Classes 'tbl_df', 'tbl' and 'data.frame': 113937 obs. of 81 variables:
## $ ListingKey : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
## $ ListingNumber : int 193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
## $ ListingCreationDate : POSIXct, format: "2007-08-26 19:09:29" "2014-02-27 08:28:07" ...
## $ CreditGrade : Factor w/ 8 levels "A","AA","B","C",..: 4 NA 7 NA NA NA NA NA NA NA ...
## $ Term : int 36 36 36 36 36 60 36 36 36 36 ...
## $ LoanStatus : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
## $ ClosedDate : POSIXct, format: "2009-08-14" NA ...
## $ BorrowerAPR : num 0.165 0.12 0.283 0.125 0.246 ...
## $ BorrowerRate : num 0.158 0.092 0.275 0.0974 0.2085 ...
## $ LenderYield : num 0.138 0.082 0.24 0.0874 0.1985 ...
## $ EstimatedEffectiveYield : num NA 0.0796 NA 0.0849 0.1832 ...
## $ EstimatedLoss : num NA 0.0249 NA 0.0249 0.0925 ...
## $ EstimatedReturn : num NA 0.0547 NA 0.06 0.0907 ...
## $ ProsperRating (numeric) : int NA 6 NA 6 3 5 2 4 7 7 ...
## $ ProsperRating (Alpha) : Factor w/ 7 levels "A","AA","B","C",..: NA 1 NA 1 5 3 6 4 2 2 ...
## $ ProsperScore : num NA 7 NA 9 4 10 2 4 9 11 ...
## $ ListingCategory (numeric) : int 0 2 0 16 2 1 1 2 7 7 ...
## $ BorrowerState : Factor w/ 51 levels "AK","AL","AR",..: 6 6 11 11 24 33 17 5 15 15 ...
## $ Occupation : Factor w/ 67 levels "Accountant/CPA",..: 36 42 36 51 20 42 49 28 23 23 ...
## $ EmploymentStatus : Factor w/ 8 levels "Employed","Full-time",..: 8 1 3 1 1 1 1 1 1 1 ...
## $ EmploymentStatusDuration : int 2 44 NA 113 44 82 172 103 269 269 ...
## $ IsBorrowerHomeowner : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
## $ CurrentlyInGroup : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
## $ GroupKey : Factor w/ 706 levels "00343376901312423168731",..: NA NA 334 NA NA NA NA NA NA NA ...
## $ DateCreditPulled : POSIXct, format: "2007-08-26 18:41:46" "2014-02-27 08:28:14" ...
## $ CreditScoreRangeLower : int 640 680 480 800 680 740 680 700 820 820 ...
## $ CreditScoreRangeUpper : int 659 699 499 819 699 759 699 719 839 839 ...
## $ FirstRecordedCreditLine : POSIXct, format: "2001-10-11" "1996-03-18" ...
## $ CurrentCreditLines : int 5 14 NA 5 19 21 10 6 17 17 ...
## $ OpenCreditLines : int 4 14 NA 5 19 17 7 6 16 16 ...
## $ TotalCreditLinespast7years : int 12 29 3 29 49 49 20 10 32 32 ...
## $ OpenRevolvingAccounts : int 1 13 0 7 6 13 6 5 12 12 ...
## $ OpenRevolvingMonthlyPayment : num 24 389 0 115 220 1410 214 101 219 219 ...
## $ InquiriesLast6Months : int 3 3 0 0 1 0 0 3 1 1 ...
## $ TotalInquiries : num 3 5 1 1 9 2 0 16 6 6 ...
## $ CurrentDelinquencies : int 2 0 1 4 0 0 0 0 0 0 ...
## $ AmountDelinquent : num 472 0 NA 10056 0 ...
## $ DelinquenciesLast7Years : int 4 0 0 14 0 0 0 0 0 0 ...
## $ PublicRecordsLast10Years : int 0 1 0 0 0 0 0 1 0 0 ...
## $ PublicRecordsLast12Months : int 0 0 NA 0 0 0 0 0 0 0 ...
## $ RevolvingCreditBalance : num 0 3989 NA 1444 6193 ...
## $ BankcardUtilization : num 0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
## $ AvailableBankcardCredit : num 1500 10266 NA 30754 695 ...
## $ TotalTrades : num 11 29 NA 26 39 47 16 10 29 29 ...
## $ TradesNeverDelinquent (percentage) : num 0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
## $ TradesOpenedLast6Months : num 0 2 NA 0 2 0 0 0 1 1 ...
## $ DebtToIncomeRatio : num 0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
## $ IncomeRange : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
## $ IncomeVerifiable : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
## $ StatedMonthlyIncome : num 3083 6125 2083 2875 9583 ...
## $ LoanKey : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
## $ TotalProsperLoans : int NA NA NA NA 1 NA NA NA NA NA ...
## $ TotalProsperPaymentsBilled : int NA NA NA NA 11 NA NA NA NA NA ...
## $ OnTimeProsperPayments : int NA NA NA NA 11 NA NA NA NA NA ...
## $ ProsperPaymentsLessThanOneMonthLate: int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPaymentsOneMonthPlusLate : int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPrincipalBorrowed : num NA NA NA NA 11000 NA NA NA NA NA ...
## $ ProsperPrincipalOutstanding : num NA NA NA NA 9948 ...
## $ ScorexChangeAtTimeOfListing : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanCurrentDaysDelinquent : int 0 0 0 0 0 0 0 0 0 0 ...
## $ LoanFirstDefaultedCycleNumber : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanMonthsSinceOrigination : int 78 0 86 16 6 3 11 10 3 3 ...
## $ LoanNumber : int 19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
## $ LoanOriginalAmount : int 9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
## $ LoanOriginationDate : POSIXct, format: "2007-09-12" "2014-03-03" ...
## $ LoanOriginationQuarter : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
## $ MemberKey : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
## $ MonthlyLoanPayment : num 330 319 123 321 564 ...
## $ LP_CustomerPayments : num 11396 0 4187 5143 2820 ...
## $ LP_CustomerPrincipalPayments : num 9425 0 3001 4091 1563 ...
## $ LP_InterestandFees : num 1971 0 1186 1052 1257 ...
## $ LP_ServiceFees : num -133.2 0 -24.2 -108 -60.3 ...
## $ LP_CollectionFees : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_GrossPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NetPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NonPrincipalRecoverypayments : num 0 0 0 0 0 0 0 0 0 0 ...
## $ PercentFunded : num 1 1 1 1 1 1 1 1 1 1 ...
## $ Recommendations : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsCount : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsAmount : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Investors : int 258 1 41 158 20 1 1 1 1 1 ...
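The `str()` output above shows several column names containing spaces and parentheses. A minimal sketch of one way to list such names, assuming base R's `make.names()` as the test for whether a name is syntactically valid; the `cols` vector is a hand-copied subset of the names above, not the full data set:

```r
# Column names copied from the str() output above (subset, for illustration)
cols <- c("ListingKey", "ProsperRating (numeric)", "ProsperRating (Alpha)",
          "ListingCategory (numeric)", "TradesNeverDelinquent (percentage)",
          "Term")

# make.names() returns a syntactically valid version of each name;
# any name it changes would need backticks (or renaming) to be used directly
needs_rename <- cols[cols != make.names(cols)]
print(needs_rename)
```

On the real data, `names(data)` would take the place of `cols`.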
The first thing I notice is that several date columns should be formatted as dates, and several boolean (True/False) columns should be logical. ListingCategory (numeric) is actually a category label, not a numeric measure, unfortunately without a key to disambiguate the labels. I also want to order the levels of the factor columns that are inherently ordered (CreditGrade, ProsperRating (Alpha), IncomeRange, LoanOriginationQuarter). Several column names contain spaces or special characters, which makes these columns awkward to refer to, so I will rename them.
# changes variable types and names, where appropriate
data %<>%
  mutate_at(c("ListingCreationDate", "ClosedDate", "DateCreditPulled", "FirstRecordedCreditLine", "LoanOriginationDate"), as.Date) %>%
  mutate_at(c("IsBorrowerHomeowner", "CurrentlyInGroup", "IncomeVerifiable"), as.logical) %>%
  rename_all(~sub(" (numeric)", ".num", ., fixed = TRUE)) %>%
  rename_all(~sub(" (Alpha)", ".alpha", ., fixed = TRUE)) %>%
  rename_all(~sub(" (percentage)", ".per", ., fixed = TRUE)) %>%
  mutate_at("ListingCategory.num", as.factor)
# orders factor levels, where appropriate
data$CreditGrade <- ordered(data$CreditGrade, c("NC","HR","E","D","C","B","A","AA"))
data$ProsperRating.alpha <- ordered(data$ProsperRating.alpha, c("NC","HR","E","D","C","B","A","AA"))
data$IncomeRange <- ordered(data$IncomeRange, c("Not displayed","Not employed","$0","$1-24,999","$25,000-49,999","$50,000-74,999","$75,000-99,999","$100,000+"))
data$LoanOriginationQuarter <- ordered(data$LoanOriginationQuarter, c("Q1 2006", "Q2 2006", "Q3 2006", "Q4 2006", "Q1 2007", "Q2 2007", "Q3 2007", "Q4 2007", "Q1 2008", "Q2 2008", "Q3 2008", "Q4 2008", "Q1 2009", "Q2 2009", "Q3 2009", "Q4 2009", "Q1 2010", "Q2 2010", "Q3 2010", "Q4 2010", "Q1 2011", "Q2 2011", "Q3 2011", "Q4 2011", "Q1 2012", "Q2 2012", "Q3 2012", "Q4 2012", "Q1 2013", "Q2 2013", "Q3 2013", "Q4 2013", "Q1 2014", "Q2 2014", "Q3 2014", "Q4 2014"))
str(data)
## Classes 'tbl_df', 'tbl' and 'data.frame': 113937 obs. of 81 variables:
## $ ListingKey : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
## $ ListingNumber : int 193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
## $ ListingCreationDate : Date, format: "2007-08-26" "2014-02-27" ...
## $ CreditGrade : Ord.factor w/ 8 levels "NC"<"HR"<"E"<..: 5 NA 2 NA NA NA NA NA NA NA ...
## $ Term : int 36 36 36 36 36 60 36 36 36 36 ...
## $ LoanStatus : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
## $ ClosedDate : Date, format: "2009-08-14" NA ...
## $ BorrowerAPR : num 0.165 0.12 0.283 0.125 0.246 ...
## $ BorrowerRate : num 0.158 0.092 0.275 0.0974 0.2085 ...
## $ LenderYield : num 0.138 0.082 0.24 0.0874 0.1985 ...
## $ EstimatedEffectiveYield : num NA 0.0796 NA 0.0849 0.1832 ...
## $ EstimatedLoss : num NA 0.0249 NA 0.0249 0.0925 ...
## $ EstimatedReturn : num NA 0.0547 NA 0.06 0.0907 ...
## $ ProsperRating.num : int NA 6 NA 6 3 5 2 4 7 7 ...
## $ ProsperRating.alpha : Ord.factor w/ 8 levels "NC"<"HR"<"E"<..: NA 7 NA 7 4 6 3 5 8 8 ...
## $ ProsperScore : num NA 7 NA 9 4 10 2 4 9 11 ...
## $ ListingCategory.num : Factor w/ 21 levels "0","1","2","3",..: 1 3 1 17 3 2 2 3 8 8 ...
## $ BorrowerState : Factor w/ 51 levels "AK","AL","AR",..: 6 6 11 11 24 33 17 5 15 15 ...
## $ Occupation : Factor w/ 67 levels "Accountant/CPA",..: 36 42 36 51 20 42 49 28 23 23 ...
## $ EmploymentStatus : Factor w/ 8 levels "Employed","Full-time",..: 8 1 3 1 1 1 1 1 1 1 ...
## $ EmploymentStatusDuration : int 2 44 NA 113 44 82 172 103 269 269 ...
## $ IsBorrowerHomeowner : logi TRUE FALSE FALSE TRUE TRUE TRUE ...
## $ CurrentlyInGroup : logi TRUE FALSE TRUE FALSE FALSE FALSE ...
## $ GroupKey : Factor w/ 706 levels "00343376901312423168731",..: NA NA 334 NA NA NA NA NA NA NA ...
## $ DateCreditPulled : Date, format: "2007-08-26" "2014-02-27" ...
## $ CreditScoreRangeLower : int 640 680 480 800 680 740 680 700 820 820 ...
## $ CreditScoreRangeUpper : int 659 699 499 819 699 759 699 719 839 839 ...
## $ FirstRecordedCreditLine : Date, format: "2001-10-11" "1996-03-18" ...
## $ CurrentCreditLines : int 5 14 NA 5 19 21 10 6 17 17 ...
## $ OpenCreditLines : int 4 14 NA 5 19 17 7 6 16 16 ...
## $ TotalCreditLinespast7years : int 12 29 3 29 49 49 20 10 32 32 ...
## $ OpenRevolvingAccounts : int 1 13 0 7 6 13 6 5 12 12 ...
## $ OpenRevolvingMonthlyPayment : num 24 389 0 115 220 1410 214 101 219 219 ...
## $ InquiriesLast6Months : int 3 3 0 0 1 0 0 3 1 1 ...
## $ TotalInquiries : num 3 5 1 1 9 2 0 16 6 6 ...
## $ CurrentDelinquencies : int 2 0 1 4 0 0 0 0 0 0 ...
## $ AmountDelinquent : num 472 0 NA 10056 0 ...
## $ DelinquenciesLast7Years : int 4 0 0 14 0 0 0 0 0 0 ...
## $ PublicRecordsLast10Years : int 0 1 0 0 0 0 0 1 0 0 ...
## $ PublicRecordsLast12Months : int 0 0 NA 0 0 0 0 0 0 0 ...
## $ RevolvingCreditBalance : num 0 3989 NA 1444 6193 ...
## $ BankcardUtilization : num 0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
## $ AvailableBankcardCredit : num 1500 10266 NA 30754 695 ...
## $ TotalTrades : num 11 29 NA 26 39 47 16 10 29 29 ...
## $ TradesNeverDelinquent.per : num 0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
## $ TradesOpenedLast6Months : num 0 2 NA 0 2 0 0 0 1 1 ...
## $ DebtToIncomeRatio : num 0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
## $ IncomeRange : Ord.factor w/ 8 levels "Not displayed"<..: 5 6 1 5 8 8 5 5 5 5 ...
## $ IncomeVerifiable : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
## $ StatedMonthlyIncome : num 3083 6125 2083 2875 9583 ...
## $ LoanKey : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
## $ TotalProsperLoans : int NA NA NA NA 1 NA NA NA NA NA ...
## $ TotalProsperPaymentsBilled : int NA NA NA NA 11 NA NA NA NA NA ...
## $ OnTimeProsperPayments : int NA NA NA NA 11 NA NA NA NA NA ...
## $ ProsperPaymentsLessThanOneMonthLate: int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPaymentsOneMonthPlusLate : int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPrincipalBorrowed : num NA NA NA NA 11000 NA NA NA NA NA ...
## $ ProsperPrincipalOutstanding : num NA NA NA NA 9948 ...
## $ ScorexChangeAtTimeOfListing : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanCurrentDaysDelinquent : int 0 0 0 0 0 0 0 0 0 0 ...
## $ LoanFirstDefaultedCycleNumber : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanMonthsSinceOrigination : int 78 0 86 16 6 3 11 10 3 3 ...
## $ LoanNumber : int 19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
## $ LoanOriginalAmount : int 9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
## $ LoanOriginationDate : Date, format: "2007-09-12" "2014-03-03" ...
## $ LoanOriginationQuarter : Ord.factor w/ 36 levels "Q1 2006"<"Q2 2006"<..: 7 33 5 28 31 32 30 30 32 32 ...
## $ MemberKey : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
## $ MonthlyLoanPayment : num 330 319 123 321 564 ...
## $ LP_CustomerPayments : num 11396 0 4187 5143 2820 ...
## $ LP_CustomerPrincipalPayments : num 9425 0 3001 4091 1563 ...
## $ LP_InterestandFees : num 1971 0 1186 1052 1257 ...
## $ LP_ServiceFees : num -133.2 0 -24.2 -108 -60.3 ...
## $ LP_CollectionFees : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_GrossPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NetPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NonPrincipalRecoverypayments : num 0 0 0 0 0 0 0 0 0 0 ...
## $ PercentFunded : num 1 1 1 1 1 1 1 1 1 1 ...
## $ Recommendations : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsCount : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsAmount : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Investors : int 258 1 41 158 20 1 1 1 1 1 ...
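Typing a long vector of quarter labels by hand is easy to get wrong, and any value that is not among the supplied levels silently becomes NA. A minimal sketch of generating the labels instead, assuming the "Qn YYYY" label format seen above; the year range 2005 to 2014 and the toy vector `x` are assumptions for illustration:

```r
# Assumed year range; adjust to the span of the origination dates
years <- 2005:2014

# "Q1 2005", "Q2 2005", ..., "Q4 2014" in chronological order
quarter_levels <- paste0("Q", 1:4, " ", rep(years, each = 4))

# Toy values standing in for the LoanOriginationQuarter column
x <- c("Q4 2005", "Q1 2006", "Q1 2014")

# With every quarter among the levels, nothing is lost
with_2005 <- factor(x, levels = quarter_levels, ordered = TRUE)

# Dropping the 2005 quarters from the levels turns "Q4 2005" into NA
without_2005 <- factor(x, levels = quarter_levels[-(1:4)], ordered = TRUE)

sum(is.na(with_2005))     # 0
sum(is.na(without_2005))  # 1
```

Comparing the NA count before and after an `ordered()` call is a cheap check that the level list covers every value in the column.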
Now I want to look at a summary of the data to get a first impression of each variable:
summary(data)
## ListingKey ListingNumber ListingCreationDate
## 17A93590655669644DB4C06: 6 Min. : 4 Min. :2005-11-09
## 349D3587495831350F0F648: 4 1st Qu.: 400919 1st Qu.:2008-09-19
## 47C1359638497431975670B: 4 Median : 600554 Median :2012-06-16
## 8474358854651984137201C: 4 Mean : 627886 Mean :2011-07-08
## DE8535960513435199406CE: 4 3rd Qu.: 892634 3rd Qu.:2013-09-09
## 04C13599434217079754AEE: 3 Max. :1255725 Max. :2014-03-10
## (Other) :113912
## CreditGrade Term LoanStatus
## C : 5649 Min. :12.00 Current :56576
## D : 5153 1st Qu.:36.00 Completed :38074
## B : 4389 Median :36.00 Chargedoff :11992
## AA : 3509 Mean :40.83 Defaulted : 5018
## HR : 3508 3rd Qu.:36.00 Past Due (1-15 days) : 806
## (Other): 6745 Max. :60.00 Past Due (31-60 days): 363
## NA's :84984 (Other) : 1108
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2005-11-25 Min. :0.00653 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2009-07-14 1st Qu.:0.15629 1st Qu.:0.1340 1st Qu.: 0.1242
## Median :2011-04-05 Median :0.20976 Median :0.1840 Median : 0.1730
## Mean :2011-03-07 Mean :0.21883 Mean :0.1928 Mean : 0.1827
## 3rd Qu.:2013-01-30 3rd Qu.:0.28381 3rd Qu.:0.2500 3rd Qu.: 0.2400
## Max. :2014-03-10 Max. :0.51229 Max. :0.4975 Max. : 0.4925
## NA's :58848 NA's :25
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.183 Min. :0.005 Min. :-0.183
## 1st Qu.: 0.116 1st Qu.:0.042 1st Qu.: 0.074
## Median : 0.162 Median :0.072 Median : 0.092
## Mean : 0.169 Mean :0.080 Mean : 0.096
## 3rd Qu.: 0.224 3rd Qu.:0.112 3rd Qu.: 0.117
## Max. : 0.320 Max. :0.366 Max. : 0.284
## NA's :29084 NA's :29084 NA's :29084
## ProsperRating.num ProsperRating.alpha ProsperScore ListingCategory.num
## Min. :1.000 C :18345 Min. : 1.00 1 :58308
## 1st Qu.:3.000 B :15581 1st Qu.: 4.00 0 :16965
## Median :4.000 A :14551 Median : 6.00 7 :10494
## Mean :4.072 D :14274 Mean : 5.95 2 : 7433
## 3rd Qu.:5.000 E : 9795 3rd Qu.: 8.00 3 : 7189
## Max. :7.000 (Other):12307 Max. :11.00 6 : 2572
## NA's :29084 NA's :29084 NA's :29084 (Other):10976
## BorrowerState Occupation EmploymentStatus
## CA :14717 Other :28617 Employed :67322
## TX : 6842 Professional :13628 Full-time :26355
## NY : 6729 Computer Programmer: 4478 Self-employed: 6134
## FL : 6720 Executive : 4311 Not available: 5347
## IL : 5921 Teacher : 3759 Other : 3806
## (Other):67493 (Other) :55556 (Other) : 2718
## NA's : 5515 NA's : 3588 NA's : 2255
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 0.00 Mode :logical Mode :logical
## 1st Qu.: 26.00 FALSE:56459 FALSE:101218
## Median : 67.00 TRUE :57478 TRUE :12719
## Mean : 96.07
## 3rd Qu.:137.00
## Max. :755.00
## NA's :7625
## GroupKey DateCreditPulled
## 783C3371218786870A73D20: 1140 Min. :2005-11-09
## 3D4D3366260257624AB272D: 916 1st Qu.:2008-09-16
## 6A3B336601725506917317E: 698 Median :2012-06-17
## FEF83377364176536637E50: 611 Mean :2011-07-09
## C9643379247860156A00EC0: 342 3rd Qu.:2013-09-11
## (Other) : 9634 Max. :2014-03-10
## NA's :100596
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1947-08-24
## 1st Qu.:660.0 1st Qu.:679.0 1st Qu.:1990-06-01
## Median :680.0 Median :699.0 Median :1995-11-01
## Mean :685.6 Mean :704.6 Mean :1994-11-17
## 3rd Qu.:720.0 3rd Qu.:739.0 3rd Qu.:2000-03-14
## Max. :880.0 Max. :899.0 Max. :2012-12-22
## NA's :591 NA's :591 NA's :697
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.00 Min. : 0.00 Min. : 2.00
## 1st Qu.: 7.00 1st Qu.: 6.00 1st Qu.: 17.00
## Median :10.00 Median : 9.00 Median : 25.00
## Mean :10.32 Mean : 9.26 Mean : 26.75
## 3rd Qu.:13.00 3rd Qu.:12.00 3rd Qu.: 35.00
## Max. :59.00 Max. :54.00 Max. :136.00
## NA's :7604 NA's :7604 NA's :697
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.00 Min. : 0.0 Min. : 0.000
## 1st Qu.: 4.00 1st Qu.: 114.0 1st Qu.: 0.000
## Median : 6.00 Median : 271.0 Median : 1.000
## Mean : 6.97 Mean : 398.3 Mean : 1.435
## 3rd Qu.: 9.00 3rd Qu.: 525.0 3rd Qu.: 2.000
## Max. :51.00 Max. :14985.0 Max. :105.000
## NA's :697
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.0000 Min. : 0.0
## 1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.: 0.0
## Median : 4.000 Median : 0.0000 Median : 0.0
## Mean : 5.584 Mean : 0.5921 Mean : 984.5
## 3rd Qu.: 7.000 3rd Qu.: 0.0000 3rd Qu.: 0.0
## Max. :379.000 Max. :83.0000 Max. :463881.0
## NA's :1159 NA's :697 NA's :7622
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 4.155 Mean : 0.3126
## 3rd Qu.: 3.000 3rd Qu.: 0.0000
## Max. :99.000 Max. :38.0000
## NA's :990 NA's :697
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. : 0.000 Min. : 0 Min. :0.000
## 1st Qu.: 0.000 1st Qu.: 3121 1st Qu.:0.310
## Median : 0.000 Median : 8549 Median :0.600
## Mean : 0.015 Mean : 17599 Mean :0.561
## 3rd Qu.: 0.000 3rd Qu.: 19521 3rd Qu.:0.840
## Max. :20.000 Max. :1435667 Max. :5.950
## NA's :7604 NA's :7604 NA's :7604
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 0.00 Min. :0.000
## 1st Qu.: 880 1st Qu.: 15.00 1st Qu.:0.820
## Median : 4100 Median : 22.00 Median :0.940
## Mean : 11210 Mean : 23.23 Mean :0.886
## 3rd Qu.: 13180 3rd Qu.: 30.00 3rd Qu.:1.000
## Max. :646285 Max. :126.00 Max. :1.000
## NA's :7544 NA's :7544 NA's :7544
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.000 $25,000-49,999:32192
## 1st Qu.: 0.000 1st Qu.: 0.140 $50,000-74,999:31050
## Median : 0.000 Median : 0.220 $100,000+ :17337
## Mean : 0.802 Mean : 0.276 $75,000-99,999:16916
## 3rd Qu.: 1.000 3rd Qu.: 0.320 Not displayed : 7741
## Max. :20.000 Max. :10.010 $1-24,999 : 7274
## NA's :7544 NA's :8554 (Other) : 1427
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 CB1B37030986463208432A1: 6
## FALSE:8669 1st Qu.: 3200 2DEE3698211017519D7333F: 4
## TRUE :105268 Median : 4667 9F4B37043517554537C364C: 4
## Mean : 5608 D895370150591392337ED6D: 4
## 3rd Qu.: 6825 E6FB37073953690388BC56D: 4
## Max. :1750003 0D8F37036734373301ED419: 3
## (Other) :113912
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.00 1st Qu.: 9.00 1st Qu.: 9.00
## Median :1.00 Median : 16.00 Median : 15.00
## Mean :1.42 Mean : 22.93 Mean : 22.27
## 3rd Qu.:2.00 3rd Qu.: 33.00 3rd Qu.: 32.00
## Max. :8.00 Max. :141.00 Max. :141.00
## NA's :91852 NA's :91852 NA's :91852
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00
## Median : 0.00 Median : 0.00
## Mean : 0.61 Mean : 0.05
## 3rd Qu.: 0.00 3rd Qu.: 0.00
## Max. :42.00 Max. :21.00
## NA's :91852 NA's :91852
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 0 Min. : 0
## 1st Qu.: 3500 1st Qu.: 0
## Median : 6000 Median : 1627
## Mean : 8472 Mean : 2930
## 3rd Qu.:11000 3rd Qu.: 4127
## Max. :72499 Max. :23451
## NA's :91852 NA's :91852
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-209.00 Min. : 0.0
## 1st Qu.: -35.00 1st Qu.: 0.0
## Median : -3.00 Median : 0.0
## Mean : -3.22 Mean : 152.8
## 3rd Qu.: 25.00 3rd Qu.: 0.0
## Max. : 286.00 Max. :2704.0
## NA's :95009
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.00 Min. : 0.0 Min. : 1
## 1st Qu.: 9.00 1st Qu.: 6.0 1st Qu.: 37332
## Median :14.00 Median : 21.0 Median : 68599
## Mean :16.27 Mean : 31.9 Mean : 69444
## 3rd Qu.:22.00 3rd Qu.: 65.0 3rd Qu.:101901
## Max. :44.00 Max. :100.0 Max. :136486
## NA's :96985
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2005-11-15 Q4 2013:14450
## 1st Qu.: 4000 1st Qu.:2008-10-02 Q1 2014:12172
## Median : 6500 Median :2012-06-26 Q3 2013: 9180
## Mean : 8337 Mean :2011-07-21 Q2 2013: 7099
## 3rd Qu.:12000 3rd Qu.:2013-09-18 Q3 2012: 5632
## Max. :35000 Max. :2014-03-12 (Other):65382
## NA's : 22
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 63CA34120866140639431C9: 9 Min. : 0.0 Min. : -2.35
## 16083364744933457E57FB9: 8 1st Qu.: 131.6 1st Qu.: 1005.76
## 3A2F3380477699707C81385: 8 Median : 217.7 Median : 2583.83
## 4D9C3403302047712AD0CDD: 8 Mean : 272.5 Mean : 4183.08
## 739C338135235294782AE75: 8 3rd Qu.: 371.6 3rd Qu.: 5548.40
## 7E1733653050264822FAA3D: 8 Max. :2251.5 Max. :40702.39
## (Other) :113888
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : -2.35 Min. :-664.87
## 1st Qu.: 500.9 1st Qu.: 274.87 1st Qu.: -73.18
## Median : 1587.5 Median : 700.84 Median : -34.44
## Mean : 3105.5 Mean : 1077.54 Mean : -54.73
## 3rd Qu.: 4000.0 3rd Qu.: 1458.54 3rd Qu.: -13.92
## Max. :35000.0 Max. :15617.03 Max. : 32.06
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9274.75 Min. : -94.2 Min. : -954.5
## 1st Qu.: 0.00 1st Qu.: 0.0 1st Qu.: 0.0
## Median : 0.00 Median : 0.0 Median : 0.0
## Mean : -14.24 Mean : 700.4 Mean : 681.4
## 3rd Qu.: 0.00 3rd Qu.: 0.0 3rd Qu.: 0.0
## Max. : 0.00 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.00 Min. :0.7000 Min. : 0.00000
## 1st Qu.: 0.00 1st Qu.:1.0000 1st Qu.: 0.00000
## Median : 0.00 Median :1.0000 Median : 0.00000
## Mean : 25.14 Mean :0.9986 Mean : 0.04803
## 3rd Qu.: 0.00 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :21117.90 Max. :1.0125 Max. :39.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. : 0.00000 Min. : 0.00 Min. : 1.00
## 1st Qu.: 0.00000 1st Qu.: 0.00 1st Qu.: 2.00
## Median : 0.00000 Median : 0.00 Median : 44.00
## Mean : 0.02346 Mean : 16.55 Mean : 80.48
## 3rd Qu.: 0.00000 3rd Qu.: 0.00 3rd Qu.: 115.00
## Max. :33.00000 Max. :25000.00 Max. :1189.00
##
I first see that many of the columns have a lot of missing data. It is not immediately clear to me whether the data for those rows was simply never collected, or whether the information in those columns does not apply to the observations in those rows. In some cases, this is indicated in the document linked in the section below, which lists and describes the variables in this data set. I will sort this out as I move through the data, but I want to see whether some information is, for example, only entered once the loan has been closed or completed. First, though, I will identify the factors of interest.
Prosper Loans, through very cursory research (https://en.wikipedia.org/wiki/Prosper_Marketplace), appears to be a peer-to-peer lending company. The primary concern of companies is profit, and in this case, as I see no obvious measure of profit to the company itself, I will focus on profit to the lender (the lenders, presumably, keep the company in business). Of course, borrowers also keep the company in business, and given the measures collected, it’s possible to at least take a look at how borrower demographics influence loan funding, payments, and interest rates. Variable names are cross-referenced with a document linked from the Kaggle data page: https://docs.google.com/spreadsheets/d/1gDyi_L4UvIrLTEC6Wri5nbaMmkGmLQBk-Yx3z0XDEtI/edit#gid=0.
The variables of most interest to lenders, I assume, might be (for example) LoanStatus (whether a loan is in good standing, repaid, or written off, etc.), LenderYield (yield minus servicing fee), EstimatedEffectiveYield (yield minus servicing fee and uncollected interest, plus late fees), which is likely more informative than LenderYield, EstimatedReturn (overall estimated return, taking into account both estimated yield and estimated loss), EstimatedLoss (loss on charge-offs), LoanCurrentDaysDelinquent, LP_GrossPrincipalLoss, and LP_NetPrincipalLoss. These seem most indicative of how much lenders might profit, or lose, from any particular borrower. What the lender should care most about, overall, is the ability to predict whether (or to what degree) a given (current or future) loan will pay off. In some cases, it is unclear from the documentation whether these are predictions assigned by Prosper at the outset, or descriptions of what actually happened during the course of loan repayment. Exploring the data might shed some light on this.
On the other hand, the variables I intuitively expect might be predictive of profit are the following (for example): CreditGrade (credit grade assigned when the listing went live), ProsperRating (rating assigned when the loan went live), ProsperScore (risk score), EstimatedReturn (predicted difference between estimated effective yield and estimated loss), Occupation, EmploymentStatus, EmploymentStatusDuration, IsBorrowerHomeowner, CreditScoreRangeLower/CreditScoreRangeUpper, FirstRecordedCreditLine, CurrentCreditLines, OpenCreditLines, TotalCreditLinespast7years, OpenRevolvingAccounts, OpenRevolvingMonthlyPayment, InquiriesLast6Months, TotalInquiries, CurrentDelinquencies, AmountDelinquent, DelinquenciesLast7Years, PublicRecordsLast10Years, PublicRecordsLast12Months, RevolvingCreditBalance, BankcardUtilization, AvailableBankcardCredit, TotalTrades (number of trade lines ever opened), TradesNeverDelinquent, TradesOpenedLast6Months, DebtToIncomeRatio, IncomeRange, IncomeVerifiable, StatedMonthlyIncome, TotalProsperLoans (prior Prosper loans), TotalProsperPaymentsBilled (presumably, number of payments billed at time of listing), OnTimeProsperPayments (number of on-time payments at time of listing), ProsperPaymentsLessThanOneMonthLate, ProsperPaymentsOneMonthPlusLate, ProsperPrincipalBorrowed (amount borrowed at time of listing), ProsperPrincipalOutstanding (amount outstanding at time of listing), Recommendations (number of recommendations at time of listing), InvestmentFromFriendsCount (number of friends investing), InvestmentFromFriendsAmount (amount invested by friends), and Investors (total number of investors). There are too many variables to look at up front, and I expect to narrow the list down to a portion of these, particularly when multiple measures appear likely to reflect more or less the same thing.
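The description of EstimatedReturn as the difference between estimated effective yield and estimated loss can be checked against the values printed by `str()` above. A minimal sketch, using the three non-missing values shown there for rows 2, 4, and 5; the tolerance of 1e-3 is an assumption chosen to absorb the rounding in the printed output:

```r
# Values copied from the str() output above (rows 2, 4 and 5)
est <- data.frame(
  EstimatedEffectiveYield = c(0.0796, 0.0849, 0.1832),
  EstimatedLoss           = c(0.0249, 0.0249, 0.0925),
  EstimatedReturn         = c(0.0547, 0.0600, 0.0907)
)

# Return implied by the definition: effective yield minus loss
implied <- est$EstimatedEffectiveYield - est$EstimatedLoss

# TRUE where the implied and recorded returns agree within the tolerance
consistent <- abs(implied - est$EstimatedReturn) < 1e-3
print(consistent)
```

If the relationship holds across the full column, EstimatedReturn carries no information beyond the other two variables, which is one reason to narrow the list.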
With respect to loan funding, some of the same predictors likely also influence approved loan amounts and borrower funding, which are most likely reflected by BorrowerAPR, BorrowerRate, LoanOriginalAmount, MonthlyLoanPayment, Term (the length of the loan), and PercentFunded (although this is unlikely to be informative for recently created loans).
The borrowers and loans are primarily indexed through the variables MemberKey and LoanNumber. Additional variables for keeping track of loans include LoanOriginationDate and LoanOriginationQuarter. ClosedDate is useful for quickly indexing loans which have been closed, and for which firm conclusions can be drawn as to how much lenders profited.
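The `str()` output reports 113,066 distinct ListingKey and LoanKey values for 113,937 rows, and rows 9 and 10 of the preview share a LoanNumber, so some loans appear more than once. A minimal sketch of how repeated keys could be counted and dropped, using base R on a made-up toy table; the key values here are hypothetical:

```r
# Toy table standing in for the loan data (values are made up)
loans <- data.frame(
  ListingKey = c("K1", "K2", "K3", "K3", "K4"),
  LoanNumber = c(101L, 102L, 103L, 103L, 104L),
  MemberKey  = c("M1", "M2", "M3", "M3", "M1")
)

# Loan numbers that occur more than once
key_counts <- table(loans$LoanNumber)
repeated   <- names(key_counts[key_counts > 1])

# Number of rows that repeat an earlier LoanNumber
n_dupe_rows <- sum(duplicated(loans$LoanNumber))

# Keep only the first row for each LoanNumber
loans_unique <- loans[!duplicated(loans$LoanNumber), ]
```

Whether the repeated rows are exact duplicates or differ in some column is worth checking before dropping them.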
Here I want to look more closely at why information might be missing (e.g., whether some variables are assigned a value only once a loan has been closed).
# prints out two columns with the percent of data missing, per variable, in portions of the data set with either closed loans, or open loans
closed <- round(colMeans(is.na(filter(data, !is.na(ClosedDate))))*100,2)
not_closed <- round(colMeans(is.na(filter(data, is.na(ClosedDate))))*100,2)
data.frame(closed, not_closed)
## closed not_closed
## ListingKey 0.00 0.00
## ListingNumber 0.00 0.00
## ListingCreationDate 0.00 0.00
## CreditGrade 47.44 100.00
## Term 0.00 0.00
## LoanStatus 0.00 0.00
## ClosedDate 0.00 100.00
## BorrowerAPR 0.05 0.00
## BorrowerRate 0.00 0.00
## LenderYield 0.00 0.00
## EstimatedEffectiveYield 52.79 0.00
## EstimatedLoss 52.79 0.00
## EstimatedReturn 52.79 0.00
## ProsperRating.num 52.79 0.00
## ProsperRating.alpha 52.79 0.00
## ProsperScore 52.79 0.00
## ListingCategory.num 0.00 0.00
## BorrowerState 10.01 0.00
## Occupation 4.12 2.24
## EmploymentStatus 4.09 0.00
## EmploymentStatusDuration 13.82 0.02
## IsBorrowerHomeowner 0.00 0.00
## CurrentlyInGroup 0.00 0.00
## GroupKey 77.00 98.86
## DateCreditPulled 0.00 0.00
## CreditScoreRangeLower 1.07 0.00
## CreditScoreRangeUpper 1.07 0.00
## FirstRecordedCreditLine 1.27 0.00
## CurrentCreditLines 13.80 0.00
## OpenCreditLines 13.80 0.00
## TotalCreditLinespast7years 1.27 0.00
## OpenRevolvingAccounts 0.00 0.00
## OpenRevolvingMonthlyPayment 0.00 0.00
## InquiriesLast6Months 1.27 0.00
## TotalInquiries 2.10 0.00
## CurrentDelinquencies 1.27 0.00
## AmountDelinquent 13.84 0.00
## DelinquenciesLast7Years 1.80 0.00
## PublicRecordsLast10Years 1.27 0.00
## PublicRecordsLast12Months 13.80 0.00
## RevolvingCreditBalance 13.80 0.00
## BankcardUtilization 13.80 0.00
## AvailableBankcardCredit 13.69 0.00
## TotalTrades 13.69 0.00
## TradesNeverDelinquent.per 13.69 0.00
## TradesOpenedLast6Months 13.69 0.00
## DebtToIncomeRatio 7.68 7.35
## IncomeRange 0.00 0.00
## IncomeVerifiable 0.00 0.00
## StatedMonthlyIncome 0.00 0.00
## LoanKey 0.00 0.00
## TotalProsperLoans 80.87 80.38
## TotalProsperPaymentsBilled 80.87 80.38
## OnTimeProsperPayments 80.87 80.38
## ProsperPaymentsLessThanOneMonthLate 80.87 80.38
## ProsperPaymentsOneMonthPlusLate 80.87 80.38
## ProsperPrincipalBorrowed 80.87 80.38
## ProsperPrincipalOutstanding 80.87 80.38
## ScorexChangeAtTimeOfListing 81.05 85.58
## LoanCurrentDaysDelinquent 0.00 0.00
## LoanFirstDefaultedCycleNumber 69.24 99.99
## LoanMonthsSinceOrigination 0.00 0.00
## LoanNumber 0.00 0.00
## LoanOriginalAmount 0.00 0.00
## LoanOriginationDate 0.00 0.00
## LoanOriginationQuarter 0.04 0.00
## MemberKey 0.00 0.00
## MonthlyLoanPayment 0.00 0.00
## LP_CustomerPayments 0.00 0.00
## LP_CustomerPrincipalPayments 0.00 0.00
## LP_InterestandFees 0.00 0.00
## LP_ServiceFees 0.00 0.00
## LP_CollectionFees 0.00 0.00
## LP_GrossPrincipalLoss 0.00 0.00
## LP_NetPrincipalLoss 0.00 0.00
## LP_NonPrincipalRecoverypayments 0.00 0.00
## PercentFunded 0.00 0.00
## Recommendations 0.00 0.00
## InvestmentFromFriendsCount 0.00 0.00
## InvestmentFromFriendsAmount 0.00 0.00
## Investors 0.00 0.00
The first thing I notice is that whether a loan is closed is strongly, though in most cases not perfectly, predictive of whether a variable has missing values.
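To make that comparison easier to scan, the two missing-value columns can be sorted by how far apart they are. The following is a minimal sketch in base R on a made-up toy data frame (the values are illustrative and not taken from prosperLoanData.csv); the `gap` column and the `toy`, `is_closed` and `missing_pct` names are my own assumptions, not part of the analysis above.

```r
# Toy stand-in for the loan data (made-up values, for illustration only)
toy <- data.frame(
  ClosedDate  = as.Date(c("2009-08-14", NA, "2010-01-05", NA)),
  CreditGrade = c("C", NA, NA, NA),
  Term        = c(36, 36, 60, 36)
)

# TRUE for loans that have a closing date
is_closed <- !is.na(toy$ClosedDate)

# Percent of values missing per column, within each group
missing_pct <- data.frame(
  closed     = round(colMeans(is.na(toy[is_closed, ])) * 100, 2),
  not_closed = round(colMeans(is.na(toy[!is_closed, ])) * 100, 2)
)

# Absolute gap in percentage points (hypothetical helper column), largest first
missing_pct$gap <- abs(missing_pct$closed - missing_pct$not_closed)
missing_pct <- missing_pct[order(-missing_pct$gap), ]
print(missing_pct)
```

Applied to the real data, the same ordering would put the variables whose missingness depends most on loan status at the top of the table.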
LoanOriginationQuarter is missing for only a few closed loans (0.04%), but this value should always be present. The summary below shows that these 22 rows all have listing creation and loan origination dates in the last quarter of 2005, so I will assign the NA cells the value "Q4 2005".
summary(filter(data, is.na(LoanOriginationQuarter)))
## ListingKey ListingNumber ListingCreationDate
## 044B3365298516680DA929B: 1 Min. : 4.00 Min. :2005-11-09
## 0B4133652604109810CAA3B: 1 1st Qu.:18.25 1st Qu.:2005-11-18
## 0E0F336443449038617E9F4: 1 Median :23.50 Median :2005-11-20
## 2F123364529418907A58D4C: 1 Mean :26.05 Mean :2005-11-22
## 2F25336514614362295DA03: 1 3rd Qu.:35.75 3rd Qu.:2005-11-28
## 3480336511078238810A782: 1 Max. :59.00 Max. :2005-12-21
## (Other) :16
## CreditGrade Term LoanStatus
## AA :12 Min. :36 Completed :22
## HR : 2 1st Qu.:36 Cancelled : 0
## C : 2 Median :36 Chargedoff : 0
## B : 2 Mean :36 Current : 0
## NC : 1 3rd Qu.:36 Defaulted : 0
## E : 1 Max. :36 FinalPaymentInProgress: 0
## (Other): 2 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2005-11-25 Min. : NA Min. :0.04000 Min. :0.03500
## 1st Qu.:2005-12-02 1st Qu.: NA 1st Qu.:0.06099 1st Qu.:0.05500
## Median :2006-01-06 Median : NA Median :0.08500 Median :0.07500
## Mean :2006-08-15 Mean :NaN Mean :0.09370 Mean :0.08308
## 3rd Qu.:2006-08-30 3rd Qu.: NA 3rd Qu.:0.11500 3rd Qu.:0.09500
## Max. :2008-12-30 Max. : NA Max. :0.25000 Max. :0.24500
## NA's :22
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn ProsperRating.num
## Min. : NA Min. : NA Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA Median : NA Median : NA
## Mean :NaN Mean :NaN Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA Max. : NA Max. : NA
## NA's :22 NA's :22 NA's :22 NA's :22
## ProsperRating.alpha ProsperScore ListingCategory.num BorrowerState
## NC : 0 Min. : NA 0 :22 AK : 0
## HR : 0 1st Qu.: NA 1 : 0 AL : 0
## E : 0 Median : NA 2 : 0 AR : 0
## D : 0 Mean :NaN 3 : 0 AZ : 0
## C : 0 3rd Qu.: NA 4 : 0 CA : 0
## (Other): 0 Max. : NA 5 : 0 (Other): 0
## NA's :22 NA's :22 (Other): 0 NA's :22
## Occupation EmploymentStatus
## Accountant/CPA : 0 Employed : 0
## Administrative Assistant: 0 Full-time : 0
## Analyst : 0 Not available: 0
## Architect : 0 Not employed : 0
## Attorney : 0 Other : 0
## (Other) : 0 (Other) : 0
## NA's :22 NA's :22
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : NA Mode :logical Mode :logical
## 1st Qu.: NA FALSE:22 FALSE:22
## Median : NA
## Mean :NaN
## 3rd Qu.: NA
## Max. : NA
## NA's :22
## GroupKey DateCreditPulled CreditScoreRangeLower
## B8143364846229046768A83:4 Min. :2005-11-09 Min. : NA
## 12D7336581480170815332C:2 1st Qu.:2005-11-16 1st Qu.: NA
## 5BE63365249159793785758:2 Median :2005-11-18 Median : NA
## F0B53365823807576457B84:2 Mean :2005-11-21 Mean :NaN
## 94E9336577086235891524E:1 3rd Qu.:2005-11-28 3rd Qu.: NA
## (Other) :4 Max. :2005-12-20 Max. : NA
## NA's :7 NA's :22
## CreditScoreRangeUpper FirstRecordedCreditLine CurrentCreditLines
## Min. : NA Min. :NA Min. : NA
## 1st Qu.: NA 1st Qu.:NA 1st Qu.: NA
## Median : NA Median :NA Median : NA
## Mean :NaN Mean :NA Mean :NaN
## 3rd Qu.: NA 3rd Qu.:NA 3rd Qu.: NA
## Max. : NA Max. :NA Max. : NA
## NA's :22 NA's :22 NA's :22
## OpenCreditLines TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : NA Min. : NA Min. :0
## 1st Qu.: NA 1st Qu.: NA 1st Qu.:0
## Median : NA Median : NA Median :0
## Mean :NaN Mean :NaN Mean :0
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.:0
## Max. : NA Max. : NA Max. :0
## NA's :22 NA's :22
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. :0 Min. : NA Min. : NA
## 1st Qu.:0 1st Qu.: NA 1st Qu.: NA
## Median :0 Median : NA Median : NA
## Mean :0 Mean :NaN Mean :NaN
## 3rd Qu.:0 3rd Qu.: NA 3rd Qu.: NA
## Max. :0 Max. : NA Max. : NA
## NA's :22 NA's :22
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. : NA Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA Median : NA
## Mean :NaN Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA Max. : NA
## NA's :22 NA's :22 NA's :22
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. : NA Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA Median : NA
## Mean :NaN Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA Max. : NA
## NA's :22 NA's :22 NA's :22
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. : NA Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA Median : NA
## Mean :NaN Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA Max. : NA
## NA's :22 NA's :22 NA's :22
## TradesNeverDelinquent.per TradesOpenedLast6Months DebtToIncomeRatio
## Min. : NA Min. : NA Min. :0.01051
## 1st Qu.: NA 1st Qu.: NA 1st Qu.:0.01569
## Median : NA Median : NA Median :0.02714
## Mean :NaN Mean :NaN Mean :0.07178
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.:0.07534
## Max. : NA Max. : NA Max. :0.38380
## NA's :22 NA's :22
## IncomeRange IncomeVerifiable StatedMonthlyIncome
## Not displayed :22 Mode:logical Min. : 1083
## Not employed : 0 TRUE:22 1st Qu.: 6771
## $0 : 0 Median : 9312
## $1-24,999 : 0 Mean :11123
## $25,000-49,999: 0 3rd Qu.:14062
## $50,000-74,999: 0 Max. :29167
## (Other) : 0
## LoanKey TotalProsperLoans TotalProsperPaymentsBilled
## 051C3366339161583A81E4D: 1 Min. : NA Min. : NA
## 11463365963100969351D1D: 1 1st Qu.: NA 1st Qu.: NA
## 30FD3365652573455326F15: 1 Median : NA Median : NA
## 31AC3364816494648054FCB: 1 Mean :NaN Mean :NaN
## 32233364725508802D1C433: 1 3rd Qu.: NA 3rd Qu.: NA
## 335E3365194260894C5E804: 1 Max. : NA Max. : NA
## (Other) :16 NA's :22 NA's :22
## OnTimeProsperPayments ProsperPaymentsLessThanOneMonthLate
## Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA
## Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA
## NA's :22 NA's :22
## ProsperPaymentsOneMonthPlusLate ProsperPrincipalBorrowed
## Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA
## Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA
## NA's :22 NA's :22
## ProsperPrincipalOutstanding ScorexChangeAtTimeOfListing
## Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA
## Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA
## NA's :22 NA's :22
## LoanCurrentDaysDelinquent LoanFirstDefaultedCycleNumber
## Min. :0 Min. : NA
## 1st Qu.:0 1st Qu.: NA
## Median :0 Median : NA
## Mean :0 Mean :NaN
## 3rd Qu.:0 3rd Qu.: NA
## Max. :0 Max. : NA
## NA's :22
## LoanMonthsSinceOrigination LoanNumber LoanOriginalAmount
## Min. : 99.00 Min. : 1.00 Min. : 1000
## 1st Qu.: 99.00 1st Qu.: 6.25 1st Qu.: 1500
## Median :100.00 Median :11.50 Median : 3000
## Mean : 99.59 Mean :11.50 Mean : 3577
## 3rd Qu.:100.00 3rd Qu.:16.75 3rd Qu.: 4150
## Max. :100.00 Max. :22.00 Max. :15000
##
## LoanOriginationDate LoanOriginationQuarter MemberKey
## Min. :2005-11-15 Q1 2006: 0 D3123364665672102D89C63: 2
## 1st Qu.:2005-11-25 Q2 2006: 0 0A8633658381202043D0226: 1
## Median :2005-11-28 Q3 2006: 0 0FE0336637558007610834C: 1
## Mean :2005-12-01 Q4 2006: 0 10983364491040266AF6111: 1
## 3rd Qu.:2005-12-07 Q1 2007: 0 12C53364471219226F478E8: 1
## Max. :2005-12-30 (Other): 0 4C9A3364566879406D66E65: 1
## NA's :22 (Other) :15
## MonthlyLoanPayment LP_CustomerPayments LP_CustomerPrincipalPayments
## Min. : 0.00 Min. : 1000 Min. : 1000
## 1st Qu.: 33.82 1st Qu.: 1517 1st Qu.: 1500
## Median : 85.27 Median : 3006 Median : 3000
## Mean :102.43 Mean : 3772 Mean : 3577
## 3rd Qu.:121.65 3rd Qu.: 4586 3rd Qu.: 4150
## Max. :498.21 Max. :16446 Max. :15000
##
## LP_InterestandFees LP_ServiceFees LP_CollectionFees
## Min. : 0.330 Min. :-69.170 Min. :-123.323
## 1st Qu.: 1.725 1st Qu.: -6.713 1st Qu.: 0.000
## Median : 18.085 Median : -1.490 Median : 0.000
## Mean : 195.287 Mean : -9.947 Mean : -5.606
## 3rd Qu.: 90.547 3rd Qu.: -0.880 3rd Qu.: 0.000
## Max. :1445.530 Max. : -0.330 Max. : 0.000
##
## LP_GrossPrincipalLoss LP_NetPrincipalLoss LP_NonPrincipalRecoverypayments
## Min. :0 Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0
## Median :0 Median :0 Median :0
## Mean :0 Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0 Max. :0
##
## PercentFunded Recommendations InvestmentFromFriendsCount
## Min. :1.000 Min. :0 Min. :0
## 1st Qu.:1.000 1st Qu.:0 1st Qu.:0
## Median :1.000 Median :0 Median :0
## Mean :1.000 Mean :0 Mean :0
## 3rd Qu.:1.000 3rd Qu.:0 3rd Qu.:0
## Max. :1.011 Max. :0 Max. :0
##
## InvestmentFromFriendsAmount Investors
## Min. :0 Min. : 1.000
## 1st Qu.:0 1st Qu.: 3.000
## Median :0 Median : 5.000
## Mean :0 Mean : 5.045
## 3rd Qu.:0 3rd Qu.: 6.000
## Max. :0 Max. :14.000
##
data$LoanOriginationQuarter <- fct_explicit_na(data$LoanOriginationQuarter, "Q4 2005")
# reorders the loan origination quarter levels
data$LoanOriginationQuarter <- ordered(data$LoanOriginationQuarter, c("Q4 2005", "Q1 2006", "Q2 2006", "Q3 2006", "Q4 2006", "Q1 2007", "Q2 2007", "Q3 2007", "Q4 2007", "Q1 2008", "Q2 2008", "Q3 2008", "Q4 2008", "Q1 2009", "Q2 2009", "Q3 2009", "Q4 2009", "Q1 2010", "Q2 2010", "Q3 2010", "Q4 2010", "Q1 2011", "Q2 2011", "Q3 2011", "Q4 2011", "Q1 2012", "Q2 2012", "Q3 2012", "Q4 2012", "Q1 2013", "Q2 2013", "Q3 2013", "Q4 2013", "Q1 2014", "Q2 2014", "Q3 2014", "Q4 2014"))
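As an alternative to typing the label and the level list by hand, both can be derived from dates. This is only a sketch in base R under my own assumptions: the `dates` vector is made up, and `quarter_label`, `all_levels` and `quarter_factor` are hypothetical names. It relies on `quarters()`, which returns "Q1" to "Q4" for date values.

```r
# Made-up origination dates, for illustration only
dates <- as.Date(c("2005-11-15", "2005-12-30", "2006-02-01", "2014-03-10"))

# quarters() gives "Q1".."Q4"; append the four-digit year to match the labels above
quarter_label <- paste(quarters(dates), format(dates, "%Y"))

# All labels from Q1 2005 to Q4 2014 in chronological order,
# then drop Q1-Q3 2005 so the list starts at Q4 2005
all_levels <- paste0("Q", 1:4, " ", rep(2005:2014, each = 4))
all_levels <- all_levels[-(1:3)]

# Ordered factor using the generated levels
quarter_factor <- factor(quarter_label, levels = all_levels, ordered = TRUE)
print(quarter_factor)
```

Generating the levels this way gives the same 37 labels as the hand-written vector and avoids typos in a long literal.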
None of the open loans have a credit grade, while about half of the closed loans do. I assume that the closed loans without one are post-July 2009 loans, which were never assigned a credit grade.
summary(filter(data, !is.na(ClosedDate) &
is.na(CreditGrade)))
## ListingKey ListingNumber ListingCreationDate
## 018A360063948152589C8BE: 2 Min. : 149172 Min. :2007-06-08
## 30F435938764424435A1188: 2 1st Qu.: 479472 1st Qu.:2010-10-12
## 32943590099161153292459: 2 Median : 529900 Median :2011-09-28
## 6DFC3591891372387BB41B2: 2 Mean : 554859 Mean :2011-08-17
## 778D35919242972923313E0: 2 3rd Qu.: 600118 3rd Qu.:2012-06-14
## 82FD35914405776692938D4: 2 Max. :1204824 Max. :2014-02-13
## (Other) :26124
## CreditGrade Term LoanStatus
## NC : 0 Min. :12.00 Completed :19786
## HR : 0 1st Qu.:36.00 Chargedoff : 5342
## E : 0 Median :36.00 Defaulted : 1008
## D : 0 Mean :37.99 Cancelled : 0
## C : 0 3rd Qu.:36.00 Current : 0
## (Other): 0 Max. :60.00 FinalPaymentInProgress: 0
## NA's :26136 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2009-08-27 Min. :0.04583 Min. :0.0400 Min. :0.0300
## 1st Qu.:2012-06-12 1st Qu.:0.17359 1st Qu.:0.1469 1st Qu.:0.1369
## Median :2013-02-20 Median :0.26798 Median :0.2300 Median :0.2200
## Mean :2012-12-20 Mean :0.25118 Mean :0.2193 Mean :0.2093
## 3rd Qu.:2013-09-10 3rd Qu.:0.33553 3rd Qu.:0.2958 3rd Qu.:0.2858
## Max. :2014-03-10 Max. :0.42395 Max. :0.3600 Max. :0.3400
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.1827 Min. :0.00490 Min. :-0.1827
## 1st Qu.: 0.1106 1st Qu.:0.05200 1st Qu.: 0.0780
## Median : 0.1715 Median :0.09800 Median : 0.1144
## Mean : 0.1762 Mean :0.09379 Mean : 0.1075
## 3rd Qu.: 0.2469 3rd Qu.:0.14050 3rd Qu.: 0.1363
## Max. : 0.3199 Max. :0.36600 Max. : 0.2837
## NA's :131 NA's :131 NA's :131
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 D :5869 Min. : 1.000
## 1st Qu.:2.000 E :3830 1st Qu.: 5.000
## Median :3.000 C :3817 Median : 6.000
## Mean :3.663 HR :3725 Mean : 6.266
## 3rd Qu.:5.000 A :3608 3rd Qu.: 8.000
## Max. :7.000 (Other):5156 Max. :11.000
## NA's :131 NA's : 131 NA's :131
## ListingCategory.num BorrowerState Occupation
## 1 :12806 CA : 3325 Other : 6786
## 7 : 4790 FL : 1768 Professional : 3452
## 2 : 2623 NY : 1639 Computer Programmer : 1261
## 3 : 2383 TX : 1562 Administrative Assistant: 959
## 6 : 1211 IL : 1389 Executive : 950
## 13 : 597 GA : 1127 (Other) :12715
## (Other): 1726 (Other):15326 NA's : 13
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Employed :16491 Min. : 0.00 Mode :logical
## Full-time : 6634 1st Qu.: 27.00 FALSE:12814
## Self-employed: 1334 Median : 63.00 TRUE :13322
## Other : 798 Mean : 91.06
## Not employed : 375 3rd Qu.:127.00
## Retired : 273 Max. :755.00
## (Other) : 231 NA's :9
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 3D4D3366260257624AB272D: 201 Min. :2009-07-13
## FALSE:24741 783C3371218786870A73D20: 134 1st Qu.:2010-10-13
## TRUE :1395 52EA3425051368132B80C96: 109 Median :2011-09-29
## B0473364376920128370B13: 63 Mean :2011-08-21
## FEF83377364176536637E50: 54 3rd Qu.:2012-06-14
## (Other) : 817 Max. :2014-02-13
## NA's :24758
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. :600.0 Min. :619.0 Min. :1953-09-01
## 1st Qu.:660.0 1st Qu.:679.0 1st Qu.:1990-12-03
## Median :700.0 Median :719.0 Median :1996-04-16
## Mean :701.7 Mean :720.7 Mean :1995-04-06
## 3rd Qu.:740.0 3rd Qu.:759.0 3rd Qu.:2000-05-19
## Max. :880.0 Max. :899.0 Max. :2012-06-19
##
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.000 Min. : 2.0
## 1st Qu.: 6.000 1st Qu.: 5.000 1st Qu.: 16.0
## Median : 9.000 Median : 8.000 Median : 25.0
## Mean : 9.576 Mean : 8.454 Mean : 26.6
## 3rd Qu.:13.000 3rd Qu.:11.000 3rd Qu.: 35.0
## Max. :59.000 Max. :48.000 Max. :124.0
##
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 3.000 1st Qu.: 97.0 1st Qu.: 0.000
## Median : 6.000 Median : 231.0 Median : 1.000
## Mean : 6.442 Mean : 349.2 Mean : 1.188
## 3rd Qu.: 9.000 3rd Qu.: 457.0 3rd Qu.: 2.000
## Max. :47.000 Max. :5720.0 Max. :27.000
##
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.0000 Min. : 0.0
## 1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.: 0.0
## Median : 4.000 Median : 0.0000 Median : 0.0
## Mean : 4.646 Mean : 0.3694 Mean : 992.6
## 3rd Qu.: 6.000 3rd Qu.: 0.0000 3rd Qu.: 0.0
## Max. :74.000 Max. :32.0000 Max. :327677.0
##
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 3.401 Mean : 0.2609
## 3rd Qu.: 2.000 3rd Qu.: 0.0000
## Max. :99.000 Max. :12.0000
##
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.00000 Min. : 0 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.: 2071 1st Qu.:0.2200
## Median :0.00000 Median : 6798 Median :0.5400
## Mean :0.01144 Mean : 15210 Mean :0.5141
## 3rd Qu.:0.00000 3rd Qu.: 16600 3rd Qu.:0.8100
## Max. :4.00000 Max. :879785 Max. :2.5000
##
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0.0 Min. : 1.00 Min. :0.1600
## 1st Qu.: 850.8 1st Qu.: 14.00 1st Qu.:0.8300
## Median : 4198.0 Median : 21.00 Median :0.9500
## Mean : 11174.3 Mean : 22.87 Mean :0.8973
## 3rd Qu.: 13414.0 3rd Qu.: 30.00 3rd Qu.:1.0000
## Max. :412785.0 Max. :122.00 Max. :1.0000
##
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.0000 Min. : 0.0000 $25,000-49,999:8367
## 1st Qu.: 0.0000 1st Qu.: 0.1300 $50,000-74,999:7411
## Median : 0.0000 Median : 0.2000 $75,000-99,999:4041
## Mean : 0.7603 Mean : 0.2488 $100,000+ :3948
## 3rd Qu.: 1.0000 3rd Qu.: 0.3000 $1-24,999 :1964
## Max. :20.0000 Max. :10.0100 Not employed : 375
## NA's :2983 (Other) : 30
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 08C43696561586194AC381C: 2
## FALSE:2976 1st Qu.: 3167 09303699897852595CD59DD: 2
## TRUE :23160 Median : 4583 114D37056655628721BD6C8: 2
## Mean : 5488 156836977849742636AE34F: 2
## 3rd Qu.: 6667 56D73700259224545E36FBC: 2
## Max. :618548 63113695530739927C7EA06: 2
## (Other) :26124
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :0.000 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.000 1st Qu.: 9.00 1st Qu.: 9.00
## Median :1.000 Median : 18.00 Median : 18.00
## Mean :1.401 Mean : 22.57 Mean : 21.88
## 3rd Qu.:2.000 3rd Qu.: 33.00 3rd Qu.: 32.00
## Max. :7.000 Max. :120.00 Max. :114.00
## NA's :17826 NA's :17826 NA's :17826
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.000 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.000
## Median : 0.000 Median : 0.000
## Mean : 0.635 Mean : 0.058
## 3rd Qu.: 0.000 3rd Qu.: 0.000
## Max. :42.000 Max. :21.000
## NA's :17826 NA's :17826
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 0 Min. : 0.0
## 1st Qu.: 3000 1st Qu.: 0.0
## Median : 5000 Median : 824.7
## Mean : 7394 Mean : 2127.9
## 3rd Qu.:10000 3rd Qu.: 3179.1
## Max. :60001 Max. :22586.7
## NA's :17826 NA's :17826
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-194.00 Min. : 0.0
## 1st Qu.: -32.00 1st Qu.: 0.0
## Median : -3.00 Median : 0.0
## Mean : -0.29 Mean : 115.9
## 3rd Qu.: 29.00 3rd Qu.: 0.0
## Max. : 286.00 Max. :1593.0
## NA's :17923
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 1.00 Min. : 1.00 Min. : 38045
## 1st Qu.: 9.00 1st Qu.:21.00 1st Qu.: 45089
## Median :13.00 Median :29.00 Median : 54430
## Mean :14.49 Mean :30.47 Mean : 58559
## 3rd Qu.:19.00 3rd Qu.:41.00 3rd Qu.: 68482
## Max. :41.00 Max. :56.00 Max. :132453
## NA's :19891
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2009-07-20 Q4 2011: 2352
## 1st Qu.: 3000 1st Qu.:2010-10-29 Q2 2012: 2272
## Median : 4500 Median :2011-10-12 Q1 2012: 2252
## Mean : 6365 Mean :2011-09-03 Q3 2012: 2213
## 3rd Qu.: 8000 3rd Qu.:2012-06-25 Q3 2011: 2018
## Max. :35000 Max. :2014-02-21 Q2 2011: 1713
## (Other):13316
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## C70934206057523078260C7: 7 Min. : 0.0 Min. : -2.35
## E4AF3422677498955FFA00E: 7 1st Qu.: 121.6 1st Qu.: 2304.53
## 720D3508651090808DC328F: 6 Median : 175.9 Median : 4561.31
## D65B3496915385104F50CD7: 6 Mean : 232.2 Mean : 6193.82
## E48334334509567416C8C65: 6 3rd Qu.: 314.4 3rd Qu.: 8501.98
## 43DB3366978035224D7D9E3: 5 Max. :2251.5 Max. :37369.16
## (Other) :26099
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0 Min. : -2.35 Min. :-589.95
## 1st Qu.: 1795 1st Qu.: 326.71 1st Qu.: -70.74
## Median : 4000 Median : 746.15 Median : -35.07
## Mean : 5128 Mean : 1065.72 Mean : -52.18
## 3rd Qu.: 7000 3rd Qu.: 1487.20 3rd Qu.: -16.07
## Max. :35000 Max. :10013.57 Max. : 3.01
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-4865.08 Min. : -94.2 Min. : -504.4
## 1st Qu.: 0.00 1st Qu.: 0.0 1st Qu.: 0.0
## Median : 0.00 Median : 0.0 Median : 0.0
## Mean : -17.25 Mean : 1221.7 Mean : 1194.6
## 3rd Qu.: 0.00 3rd Qu.: 0.0 3rd Qu.: 0.0
## Max. : 0.00 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.00 Min. :0.700 Min. : 0.00000
## 1st Qu.: 0.00 1st Qu.:1.000 1st Qu.: 0.00000
## Median : 0.00 Median :1.000 Median : 0.00000
## Mean : 24.83 Mean :0.997 Mean : 0.03646
## 3rd Qu.: 0.00 3rd Qu.:1.000 3rd Qu.: 0.00000
## Max. :7780.03 Max. :1.000 Max. :18.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.00 Min. : 1.00
## 1st Qu.:0.00000 1st Qu.: 0.00 1st Qu.: 28.00
## Median :0.00000 Median : 0.00 Median : 62.00
## Mean :0.02124 Mean : 12.94 Mean : 92.67
## 3rd Qu.:0.00000 3rd Qu.: 0.00 3rd Qu.: 125.00
## Max. :9.00000 Max. :11000.00 Max. :1189.00
##
Here, I see that at least one closed loan listed before July 2009 has no credit grade (the earliest listing creation date is 2007-06-08).
summary(filter(data,
!is.na(ClosedDate) &
is.na(CreditGrade) &
ListingCreationDate < "2009-07-01"))
## ListingKey ListingNumber ListingCreationDate
## 0385345033494662260733C: 1 Min. :149172 Min. :2007-06-08
## 04D73431953660481B1EC1D: 1 1st Qu.:306608 1st Qu.:2008-04-08
## 04F334232790941784498F1: 1 Median :339464 Median :2008-05-26
## 05153419481232978723A5F: 1 Mean :341138 Mean :2008-06-24
## 059934165217732065237C5: 1 3rd Qu.:397924 3rd Qu.:2008-09-13
## 06FF342963152332574DF05: 1 Max. :415961 Max. :2009-05-06
## (Other) :125
## CreditGrade Term LoanStatus
## NC : 0 Min. :12.00 Completed :122
## HR : 0 1st Qu.:36.00 Chargedoff : 6
## E : 0 Median :36.00 Defaulted : 3
## D : 0 Mean :35.82 Cancelled : 0
## C : 0 3rd Qu.:36.00 Current : 0
## (Other): 0 Max. :36.00 FinalPaymentInProgress: 0
## NA's :131 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate
## Min. :2010-01-28 Min. :0.06207 Min. :0.05870
## 1st Qu.:2011-04-21 1st Qu.:0.11271 1st Qu.:0.09025
## Median :2012-04-05 Median :0.17018 Median :0.14000
## Mean :2012-02-01 Mean :0.18688 Mean :0.16300
## 3rd Qu.:2012-10-29 3rd Qu.:0.25811 3rd Qu.:0.22700
## Max. :2013-10-12 Max. :0.39460 Max. :0.35300
##
## LenderYield EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :0.04870 Min. : NA Min. : NA Min. : NA
## 1st Qu.:0.08025 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
## Median :0.13000 Median : NA Median : NA Median : NA
## Mean :0.15293 Mean :NaN Mean :NaN Mean :NaN
## 3rd Qu.:0.21700 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
## Max. :0.34000 Max. : NA Max. : NA Max. : NA
## NA's :131 NA's :131 NA's :131
## ProsperRating.num ProsperRating.alpha ProsperScore ListingCategory.num
## Min. : NA NC : 0 Min. : NA 1 :66
## 1st Qu.: NA HR : 0 1st Qu.: NA 7 :24
## Median : NA E : 0 Median : NA 3 :17
## Mean :NaN D : 0 Mean :NaN 2 :11
## 3rd Qu.: NA C : 0 3rd Qu.: NA 6 : 7
## Max. : NA (Other): 0 Max. : NA 5 : 6
## NA's :131 NA's :131 NA's :131 (Other): 0
## BorrowerState Occupation EmploymentStatus
## CA :18 Other :30 Full-time :104
## TX :18 Professional :23 Employed : 12
## NY : 9 Analyst : 9 Part-time : 7
## IL : 7 Computer Programmer : 9 Retired : 4
## CT : 6 Administrative Assistant: 5 Self-employed: 4
## MN : 6 Teacher : 5 Not available: 0
## (Other):67 (Other) :50 (Other) : 0
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 0.00 Mode :logical Mode :logical
## 1st Qu.: 26.00 FALSE:66 FALSE:107
## Median : 50.00 TRUE :65 TRUE :24
## Mean : 74.24
## 3rd Qu.:105.00
## Max. :472.00
##
## GroupKey DateCreditPulled CreditScoreRangeLower
## 783C3371218786870A73D20: 5 Min. :2009-07-13 Min. :600.0
## 020E3366126106360DB9421: 1 1st Qu.:2009-10-19 1st Qu.:660.0
## 17693364417023401A53169: 1 Median :2010-02-03 Median :720.0
## 18DA336463918236939DCE7: 1 Mean :2010-02-23 Mean :711.1
## 3D4D3366260257624AB272D: 1 3rd Qu.:2010-07-02 3rd Qu.:740.0
## (Other) : 15 Max. :2010-12-19 Max. :860.0
## NA's :107
## CreditScoreRangeUpper FirstRecordedCreditLine CurrentCreditLines
## Min. :619.0 Min. :1959-10-01 Min. : 1.00
## 1st Qu.:679.0 1st Qu.:1992-12-11 1st Qu.: 7.00
## Median :739.0 Median :1996-08-28 Median : 9.00
## Mean :730.1 Mean :1995-06-17 Mean :10.27
## 3rd Qu.:759.0 3rd Qu.:2000-04-07 3rd Qu.:13.00
## Max. :879.0 Max. :2007-09-10 Max. :35.00
##
## OpenCreditLines TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : 1.000 Min. : 4.00 Min. : 0.000
## 1st Qu.: 5.000 1st Qu.:17.00 1st Qu.: 4.000
## Median : 8.000 Median :22.00 Median : 6.000
## Mean : 8.832 Mean :25.51 Mean : 6.855
## 3rd Qu.:12.000 3rd Qu.:33.00 3rd Qu.: 9.000
## Max. :29.000 Max. :58.00 Max. :29.000
##
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. : 0.0 Min. :0.000 Min. : 0.000
## 1st Qu.: 90.5 1st Qu.:0.000 1st Qu.: 2.000
## Median : 239.0 Median :0.000 Median : 4.000
## Mean : 309.1 Mean :0.855 Mean : 5.191
## 3rd Qu.: 420.0 3rd Qu.:1.000 3rd Qu.: 8.000
## Max. :1956.0 Max. :9.000 Max. :19.000
##
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. :0.0000 Min. : 0.0 Min. : 0.000
## 1st Qu.:0.0000 1st Qu.: 0.0 1st Qu.: 0.000
## Median :0.0000 Median : 0.0 Median : 0.000
## Mean :0.2824 Mean : 433.7 Mean : 2.718
## 3rd Qu.:0.0000 3rd Qu.: 0.0 3rd Qu.: 0.000
## Max. :8.0000 Max. :31919.0 Max. :43.000
##
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. :0.0000 Min. :0 Min. : 0
## 1st Qu.:0.0000 1st Qu.:0 1st Qu.: 2308
## Median :0.0000 Median :0 Median : 8074
## Mean :0.1756 Mean :0 Mean :12039
## 3rd Qu.:0.0000 3rd Qu.:0 3rd Qu.:16422
## Max. :3.0000 Max. :0 Max. :97290
##
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. :0.0000 Min. : 0 Min. : 3.00
## 1st Qu.:0.1800 1st Qu.: 1557 1st Qu.:14.50
## Median :0.4400 Median : 6999 Median :19.00
## Mean :0.4524 Mean : 13522 Mean :22.21
## 3rd Qu.:0.7200 3rd Qu.: 17470 3rd Qu.:29.00
## Max. :0.9900 Max. :110117 Max. :52.00
##
## TradesNeverDelinquent.per TradesOpenedLast6Months DebtToIncomeRatio
## Min. :0.3000 Min. :0.0000 Min. :0.0200
## 1st Qu.:0.8400 1st Qu.:0.0000 1st Qu.:0.1100
## Median :0.9600 Median :0.0000 Median :0.2000
## Mean :0.8996 Mean :0.5725 Mean :0.2500
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.2725
## Max. :1.0000 Max. :5.0000 Max. :5.5900
## NA's :11
## IncomeRange IncomeVerifiable StatedMonthlyIncome
## $50,000-74,999:45 Mode :logical Min. : 212.8
## $25,000-49,999:40 FALSE:11 1st Qu.: 3333.3
## $75,000-99,999:17 TRUE :120 Median : 4616.7
## $100,000+ :16 Mean : 5111.2
## $1-24,999 :13 3rd Qu.: 6375.0
## Not displayed : 0 Max. :20833.3
## (Other) : 0
## LoanKey TotalProsperLoans
## 003C35735230494626ADB02: 1 Min. :1.000
## 02CA35638190585257E0D22: 1 1st Qu.:1.000
## 030B35936026115966F4EA0: 1 Median :1.000
## 032A357638786716375DFFB: 1 Mean :1.153
## 040235782802629332A0C8C: 1 3rd Qu.:1.000
## 05BC35722810324548A02FE: 1 Max. :3.000
## (Other) :125 NA's :72
## TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. : 1.00 Min. : 0.00
## 1st Qu.:14.50 1st Qu.:14.50
## Median :24.00 Median :22.00
## Mean :22.76 Mean :22.54
## 3rd Qu.:34.00 3rd Qu.:33.50
## Max. :42.00 Max. :41.00
## NA's :72 NA's :72
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. :0.0000 Min. :0
## 1st Qu.:0.0000 1st Qu.:0
## Median :0.0000 Median :0
## Mean :0.2203 Mean :0
## 3rd Qu.:0.0000 3rd Qu.:0
## Max. :3.0000 Max. :0
## NA's :72 NA's :72
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0.00
## 1st Qu.: 1775 1st Qu.: 0.00
## Median : 4500 Median : 0.00
## Mean : 5491 Mean : 428.24
## 3rd Qu.: 7500 3rd Qu.: 0.25
## Max. :27000 Max. :5788.52
## NA's :72 NA's :72
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-50.00 Min. : 0.00
## 1st Qu.: -7.00 1st Qu.: 0.00
## Median : 39.00 Median : 0.00
## Mean : 43.37 Mean : 53.65
## 3rd Qu.: 83.00 3rd Qu.: 0.00
## Max. :215.00 Max. :1257.00
## NA's :74
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. :10.00 Min. :39.00 Min. :38046
## 1st Qu.:18.00 1st Qu.:44.00 1st Qu.:39344
## Median :23.00 Median :49.00 Median :40869
## Mean :24.22 Mean :48.34 Mean :41386
## 3rd Qu.:32.00 3rd Qu.:52.00 3rd Qu.:43474
## Max. :37.00 Max. :56.00 Max. :46378
## NA's :122
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2009-07-22 Q4 2009:32
## 1st Qu.: 2000 1st Qu.:2009-11-08 Q3 2009:26
## Median : 3000 Median :2010-02-17 Q2 2010:21
## Mean : 4187 Mean :2010-03-11 Q4 2010:21
## 3rd Qu.: 5000 3rd Qu.:2010-07-18 Q1 2010:17
## Max. :15000 Max. :2010-12-30 Q3 2010:14
## (Other): 0
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 010B33941340101099BFE47: 1 Min. : 0.00 Min. : 458.2
## 016533808792025682035EE: 1 1st Qu.: 63.24 1st Qu.: 2161.4
## 0CCD3420393708396FB7287: 1 Median :111.95 Median : 3865.5
## 0F1733815422230679CFC01: 1 Mean :146.00 Mean : 4865.0
## 0F5133834635103374519DF: 1 3rd Qu.:188.66 3rd Qu.: 6402.7
## 10D73380714543112C251DF: 1 Max. :578.69 Max. :18748.2
## (Other) :125
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 204.8 Min. : 11.26 Min. :-242.93
## 1st Qu.: 1946.1 1st Qu.: 254.88 1st Qu.: -62.53
## Median : 3000.0 Median : 546.00 Median : -38.67
## Mean : 4043.8 Mean : 821.17 Mean : -50.11
## 3rd Qu.: 5000.0 3rd Qu.:1143.52 3rd Qu.: -19.86
## Max. :15000.0 Max. :3748.19 Max. : -1.41
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :0 Min. : 0.0 Min. : 0.0
## 1st Qu.:0 1st Qu.: 0.0 1st Qu.: 0.0
## Median :0 Median : 0.0 Median : 0.0
## Mean :0 Mean : 145.4 Mean : 145.4
## 3rd Qu.:0 3rd Qu.: 0.0 3rd Qu.: 0.0
## Max. :0 Max. :8911.2 Max. :8911.2
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. :0 Min. :1 Min. :0.00000
## 1st Qu.:0 1st Qu.:1 1st Qu.:0.00000
## Median :0 Median :1 Median :0.00000
## Mean :0 Mean :1 Mean :0.08397
## 3rd Qu.:0 3rd Qu.:1 3rd Qu.:0.00000
## Max. :0 Max. :1 Max. :2.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.00 Min. : 10.0
## 1st Qu.:0.00000 1st Qu.: 0.00 1st Qu.: 75.5
## Median :0.00000 Median : 0.00 Median :124.0
## Mean :0.03817 Mean : 57.97 Mean :155.5
## 3rd Qu.:0.00000 3rd Qu.: 0.00 3rd Qu.:204.0
## Max. :1.00000 Max. :5140.00 Max. :594.0
##
I ultimately see that 131 closed loans listed before July 2009 are missing a credit grade for no apparent reason. I don’t see any pattern here, and I assume that I cannot currently tell why this data is missing. However, this is a relatively small amount of data.
I am otherwise assuming that CreditGrade was effectively replaced by ProsperRating in July 2009, and that the two can be used more-or-less interchangeably, particularly given that their labels correspond.
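If the two ratings are treated as interchangeable, one way to use them together is to merge them into a single column. The sketch below is an assumption on my part rather than a step taken in this analysis: the toy rows are made up, and `Rating` is a hypothetical column name.

```r
# Made-up rows; the column names mirror the data set, the values are illustrative
toy <- data.frame(
  CreditGrade         = c("C", "HR", NA, NA, NA),
  ProsperRating.alpha = c(NA, NA, "D", "AA", NA),
  stringsAsFactors    = FALSE
)

# Use CreditGrade where present, otherwise fall back to ProsperRating.alpha
toy$Rating <- ifelse(is.na(toy$CreditGrade),
                     toy$ProsperRating.alpha,
                     toy$CreditGrade)

# Rows with neither rating stay NA, like the unexplained loans above
n_unrated <- sum(is.na(toy$Rating))
print(toy)
```

With factor columns, as in the imported data, both columns would need to be converted with `as.character()` first, since `ifelse()` on factors returns integer codes.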
Next, I notice that only about half of the closed loans have an estimated effective yield or the other estimates of yield and loss, whereas all of the open loans do. I assume the closed loans without them are pre-July 2009 listings, but I want to take a closer look at them.
summary(filter(data, !is.na(ClosedDate) &
is.na(EstimatedEffectiveYield)))
## ListingKey ListingNumber ListingCreationDate
## 00033425227988088FA6752: 1 Min. : 4 Min. :2005-11-09
## 000433785890431972B4743: 1 1st Qu.: 92588 1st Qu.:2007-02-02
## 00083422661625108817246: 1 Median :199844 Median :2007-09-10
## 000A34209897973969CFA81: 1 Mean :201960 Mean :2007-08-26
## 000D3410451511356B08F17: 1 3rd Qu.:314319 3rd Qu.:2008-04-19
## 00143395229257559A91663: 1 Max. :415961 Max. :2009-05-06
## (Other) :29078
## CreditGrade Term LoanStatus
## C :5649 Min. :12 Completed :18410
## D :5153 1st Qu.:36 Chargedoff : 6656
## B :4389 Median :36 Defaulted : 4013
## AA :3509 Mean :36 Cancelled : 5
## HR :3508 3rd Qu.:36 Current : 0
## (Other):6745 Max. :36 FinalPaymentInProgress: 0
## NA's : 131 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2005-11-25 Min. :0.00653 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2008-08-25 1st Qu.:0.13705 1st Qu.:0.1269 1st Qu.: 0.1170
## Median :2009-08-17 Median :0.18224 Median :0.1700 Median : 0.1600
## Mean :2009-07-30 Mean :0.19596 Mean :0.1833 Mean : 0.1730
## 3rd Qu.:2010-07-29 3rd Qu.:0.24753 3rd Qu.:0.2364 3rd Qu.: 0.2224
## Max. :2013-10-12 Max. :0.51229 Max. :0.4975 Max. : 0.4925
## NA's :25
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn ProsperRating.num
## Min. : NA Min. : NA Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA Median : NA Median : NA
## Mean :NaN Mean :NaN Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA Max. : NA Max. : NA
## NA's :29084 NA's :29084 NA's :29084 NA's :29084
## ProsperRating.alpha ProsperScore ListingCategory.num BorrowerState
## NC : 0 Min. : NA 0 :16945 CA : 3956
## HR : 0 1st Qu.: NA 1 : 5128 GA : 1661
## E : 0 Median : NA 4 : 2395 IL : 1657
## D : 0 Mean :NaN 3 : 1891 FL : 1314
## C : 0 3rd Qu.: NA 7 : 1276 TX : 1208
## (Other): 0 Max. : NA 2 : 632 (Other):13773
## NA's :29084 NA's :29084 (Other): 817 NA's : 5515
## Occupation EmploymentStatus
## Other : 7300 Full-time :18428
## Professional : 3086 Not available: 5347
## Computer Programmer: 1242 Self-employed: 1596
## Sales - Commission : 1096 Part-time : 832
## Clerical : 1048 Retired : 428
## (Other) :13057 (Other) : 198
## NA's : 2255 NA's : 2255
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 0.00 Mode :logical Mode :logical
## 1st Qu.: 15.00 FALSE:16454 FALSE:18611
## Median : 40.00 TRUE :12630 TRUE :10473
## Mean : 68.49
## 3rd Qu.: 94.00
## Max. :623.00
## NA's :7606
## GroupKey DateCreditPulled
## 783C3371218786870A73D20: 932 Min. :2005-11-09
## 6A3B336601725506917317E: 619 1st Qu.:2007-01-30
## 3D4D3366260257624AB272D: 606 Median :2007-09-04
## FEF83377364176536637E50: 529 Mean :2007-08-24
## C9643379247860156A00EC0: 342 3rd Qu.:2008-04-17
## (Other) : 8287 Max. :2010-12-19
## NA's :17769
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1947-08-24
## 1st Qu.:600.0 1st Qu.:619.0 1st Qu.:1990-07-26
## Median :640.0 Median :659.0 Median :1995-06-01
## Mean :644.4 Mean :663.4 Mean :1994-08-07
## 3rd Qu.:700.0 3rd Qu.:719.0 3rd Qu.:1999-08-31
## Max. :880.0 Max. :899.0 Max. :2008-07-01
## NA's :591 NA's :591 NA's :697
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.0 Min. : 2.00
## 1st Qu.: 5.000 1st Qu.: 4.0 1st Qu.: 13.00
## Median : 9.000 Median : 7.0 Median : 22.00
## Mean : 9.563 Mean : 8.2 Mean : 24.06
## 3rd Qu.:13.000 3rd Qu.:11.0 3rd Qu.: 32.00
## Max. :52.000 Max. :51.0 Max. :136.00
## NA's :7604 NA's :7604 NA's :697
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 2.000 1st Qu.: 35.0 1st Qu.: 0.000
## Median : 5.000 Median : 139.0 Median : 2.000
## Mean : 5.755 Mean : 303.7 Mean : 2.841
## 3rd Qu.: 8.000 3rd Qu.: 374.0 3rd Qu.: 4.000
## Max. :51.000 Max. :14985.0 Max. :105.000
## NA's :697
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.000 Min. : 0
## 1st Qu.: 3.000 1st Qu.: 0.000 1st Qu.: 0
## Median : 7.000 Median : 0.000 Median : 0
## Mean : 9.516 Mean : 1.398 Mean : 1118
## 3rd Qu.: 13.000 3rd Qu.: 1.000 3rd Qu.: 30
## Max. :379.000 Max. :83.000 Max. :444745
## NA's :1159 NA's :697 NA's :7622
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 5.652 Mean : 0.3949
## 3rd Qu.: 6.000 3rd Qu.: 1.0000
## Max. :99.000 Max. :30.0000
## NA's :990 NA's :697
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.000 Min. : 0 Min. :0.00
## 1st Qu.:0.000 1st Qu.: 1192 1st Qu.:0.20
## Median :0.000 Median : 5206 Median :0.60
## Mean :0.039 Mean : 16250 Mean :0.55
## 3rd Qu.:0.000 3rd Qu.: 15590 3rd Qu.:0.88
## Max. :7.000 Max. :1435667 Max. :5.95
## NA's :7604 NA's :7604 NA's :7604
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 0.00 Min. :0.000
## 1st Qu.: 253 1st Qu.: 11.00 1st Qu.:0.690
## Median : 2277 Median : 18.00 Median :0.870
## Mean : 10460 Mean : 20.48 Mean :0.807
## 3rd Qu.: 10162 3rd Qu.: 28.00 3rd Qu.:1.000
## Max. :646285 Max. :126.00 Max. :1.000
## NA's :7544 NA's :7544 NA's :7544
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.0000 $25,000-49,999:8017
## 1st Qu.: 0.000 1st Qu.: 0.1200 Not displayed :7741
## Median : 1.000 Median : 0.2000 $50,000-74,999:5423
## Mean : 1.088 Mean : 0.3239 $1-24,999 :2620
## 3rd Qu.: 2.000 3rd Qu.: 0.3000 $75,000-99,999:2418
## Max. :17.000 Max. :10.0100 $100,000+ :2132
## NA's :7544 NA's :1258 (Other) : 733
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 00013421083473792D70F75: 1
## FALSE:1336 1st Qu.: 2500 000534180797040005C07AA: 1
## TRUE :27748 Median : 3833 00093413855467649508680: 1
## Mean : 4665 000B3366346245964D6187E: 1
## 3rd Qu.: 5752 000B34179327090460D3429: 1
## Max. :208333 000E3392089465002A7DBA0: 1
## (Other) :29078
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :1.000 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.000 1st Qu.: 7.00 1st Qu.: 6.00
## Median :1.000 Median :10.00 Median :10.00
## Mean :1.079 Mean :11.09 Mean :10.87
## 3rd Qu.:1.000 3rd Qu.:14.00 3rd Qu.:14.00
## Max. :5.000 Max. :42.00 Max. :41.00
## NA's :26796 NA's :26796 NA's :26796
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. :0.000 Min. :0.000
## 1st Qu.:0.000 1st Qu.:0.000
## Median :0.000 Median :0.000
## Mean :0.205 Mean :0.011
## 3rd Qu.:0.000 3rd Qu.:0.000
## Max. :7.000 Max. :5.000
## NA's :26796 NA's :26796
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0
## 1st Qu.: 2550 1st Qu.: 0
## Median : 4500 Median : 1970
## Mean : 6012 Mean : 3027
## 3rd Qu.: 7500 3rd Qu.: 4145
## Max. :40000 Max. :21862
## NA's :26796 NA's :26796
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-160.000 Min. : 0.0
## 1st Qu.: 0.000 1st Qu.: 0.0
## Median : 0.000 Median : 0.0
## Mean : 7.363 Mean : 491.8
## 3rd Qu.: 40.000 3rd Qu.: 948.2
## Max. : 215.000 Max. :2704.0
## NA's :26798
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.00 Min. : 39.00 Min. : 1
## 1st Qu.:10.00 1st Qu.: 70.00 1st Qu.: 7395
## Median :16.00 Median : 78.00 Median :19450
## Mean :17.32 Mean : 78.21 Mean :19418
## 3rd Qu.:24.00 3rd Qu.: 85.00 3rd Qu.:30463
## Max. :44.00 Max. :100.00 Max. :46378
## NA's :18376
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2005-11-15 Q2 2008:4344
## 1st Qu.: 2500 1st Qu.:2007-02-13 Q3 2008:3602
## Median : 4500 Median :2007-09-21 Q2 2007:3118
## Mean : 6159 Mean :2007-09-09 Q1 2007:3079
## 3rd Qu.: 7904 3rd Qu.:2008-05-02 Q1 2008:3074
## Max. :25000 Max. :2010-12-30 Q3 2007:2671
## (Other):9196
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 3EF133647645155044BFFD9: 6 Min. : 0.00 Min. : 0
## 7E1733653050264822FAA3D: 6 1st Qu.: 84.84 1st Qu.: 1647
## 16083364744933457E57FB9: 4 Median : 153.80 Median : 3778
## 242A33660960718280E1642: 4 Mean : 215.72 Mean : 5683
## 5B8333756488098823F5EFE: 4 3rd Qu.: 275.77 3rd Qu.: 7403
## 63CA34120866140639431C9: 4 Max. :1130.90 Max. :40702
## (Other) :29056
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0 Min. : 0.0 Min. :-664.87
## 1st Qu.: 1069 1st Qu.: 335.4 1st Qu.: -76.15
## Median : 3000 Median : 779.3 Median : -33.50
## Mean : 4502 Mean : 1180.7 Mean : -54.97
## 3rd Qu.: 6000 3rd Qu.: 1532.2 3rd Qu.: -13.14
## Max. :25693 Max. :15617.0 Max. : 32.06
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9274.75 Min. : 0 Min. : -954.5
## 1st Qu.: 0.00 1st Qu.: 0 1st Qu.: 0.0
## Median : 0.00 Median : 0 Median : 0.0
## Mean : -31.86 Mean : 1647 Mean : 1596.6
## 3rd Qu.: 0.00 3rd Qu.: 1863 3rd Qu.: 1748.7
## Max. : 0.00 Max. :25000 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.00 Min. :1.000 Min. : 0.0000
## 1st Qu.: 0.00 1st Qu.:1.000 1st Qu.: 0.0000
## Median : 0.00 Median :1.000 Median : 0.0000
## Mean : 76.19 Mean :1.000 Mean : 0.1369
## 3rd Qu.: 0.00 3rd Qu.:1.000 3rd Qu.: 0.0000
## Max. :21117.90 Max. :1.011 Max. :39.0000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. : 0.00000 Min. : 0.00 Min. : 1.0
## 1st Qu.: 0.00000 1st Qu.: 0.00 1st Qu.: 34.0
## Median : 0.00000 Median : 0.00 Median : 78.0
## Mean : 0.06842 Mean : 52.25 Mean :116.1
## 3rd Qu.: 0.00000 3rd Qu.: 0.00 3rd Qu.:158.0
## Max. :33.00000 Max. :25000.00 Max. :913.0
##
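That is indeed the case: the latest listing date among these loans is 2009-05-06. The other similar measures (EstimatedLoss, EstimatedReturn, ProsperRating, and ProsperScore), which were also assigned only starting in 2009, are missing for the same number of loans (29,084), so I will assume the same explanation applies to them.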
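Matching NA counts do not strictly prove that the same rows are missing, so a row-wise check is a useful complement. This is a minimal sketch on a made-up data frame (`toy`) with the Prosper column names; `n_missing` and `same_pattern` are names I introduce for illustration.

```r
# Toy stand-in for the closed loans (values are made up for illustration)
toy <- data.frame(
  EstimatedEffectiveYield = c(NA, NA, 0.12, 0.20),
  EstimatedLoss           = c(NA, NA, 0.05, 0.09),
  EstimatedReturn         = c(NA, NA, 0.07, 0.11),
  ProsperScore            = c(NA, NA, 6, 4)
)

# For each row, count how many of the four measures are missing
n_missing <- rowSums(is.na(toy))

# If the measures are always missing together, every row is either
# fully populated (0 missing) or fully missing (all columns missing)
print(table(n_missing))
same_pattern <- all(n_missing %in% c(0, ncol(toy)))
print(same_pattern)
```

On the real data, the same idea would apply to the corresponding columns of the filtered closed-loan subset.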
I see that some borrower demographic, employment, and previous credit information is missing, but I assume that this is simply missing data, with no larger story behind it, particularly as this is a relatively small percentage of loans. I also see that more of this information is missing for loans that have been closed, which suggests to me that this data may have been either lost, or not gathered as thoroughly in the past.
The majority of the borrowers in both categories have no prior Prosper history, and it would be interesting to see if, for example, not having any Prosper history leads to more chargeoffs or delinquencies than having positive Prosper history.
Most loans were not charged off or defaulted, but about 30% of closed loans became delinquent at some point (they have a value for LoanFirstDefaultedCycleNumber). A very small number of open loans are delinquent.
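A share like this can be computed directly rather than read off the summaries. The sketch below assumes, as I do above, that a non-missing LoanFirstDefaultedCycleNumber marks a loan that became delinquent at some point; `toy` is made-up data and `delinquency_share` is a name introduced for illustration.

```r
library(dplyr)

# Toy stand-in for the Prosper data (values are made up for illustration)
toy <- tibble(
  ClosedDate = as.Date(c("2009-08-14", "2010-01-02", NA, "2011-05-30", NA)),
  LoanFirstDefaultedCycleNumber = c(NA, 12, NA, NA, 3)
)

# Share of loans with a recorded first-default cycle, split by open vs. closed
delinquency_share <- toy %>%
  mutate(closed = !is.na(ClosedDate)) %>%
  group_by(closed) %>%
  summarise(
    n_loans         = n(),
    share_defaulted = mean(!is.na(LoanFirstDefaultedCycleNumber))
  )

print(delinquency_share)
```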
At this point, I want to take a look, through plotting correlations between variables, at how predictive the above background, financial, or demographic measures are of those measures most closely related to lender profit.
LoanStatus vs. CreditGrade/ProsperRating
In the case of LoanStatus, as this is not a quantitative or clearly ordered factor, it may make sense to at least visually organize some of the labels. I therefore ‘group’ all Past Due levels together, and order the labels loosely in terms of ‘goodness’ - assuming that being on time, or having paid off the loan, is ‘good,’ and that having defaulted, or having the loan charged off, is ‘bad.’ I group CreditGrade and ProsperRating into one measure (Rating), and then plot LoanStatus against this new rating, to see if there are any obvious patterns, in terms of how likely one is to have a particular loan status, given a particular rating:
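The regrouping described here could be sketched roughly as follows. This is not necessarily the exact code behind the plot: `toy` is made-up data, and `StatusGroup`, `status_order`, and `rating_order` are names and orderings I assume for illustration.

```r
library(dplyr)

# Toy stand-in for the Prosper data (values are made up for illustration).
# The real columns are factors; wrapping them in as.character() first
# keeps coalesce() from having to reconcile different level sets.
toy <- tibble(
  LoanStatus = c("Completed", "Current", "Past Due (1-15 days)",
                 "Past Due (61-90 days)", "Chargedoff", "Defaulted"),
  CreditGrade         = c("C", NA, NA, NA, "HR", "A"),
  ProsperRating.alpha = c(NA, "B", "D", "A", NA, NA)
)

# Assumed orderings, loosely from 'good' to 'bad'; statuses or ratings
# not listed here would become NA
status_order <- c("Completed", "Current", "Past Due", "Defaulted", "Chargedoff")
rating_order <- c("AA", "A", "B", "C", "D", "E", "HR", "NC")

grouped <- toy %>%
  mutate(
    # Collapse every "Past Due (...)" level into a single label
    StatusGroup = if_else(grepl("^Past Due", LoanStatus), "Past Due", LoanStatus),
    StatusGroup = factor(StatusGroup, levels = status_order),
    # Take CreditGrade where present, otherwise ProsperRating.alpha
    Rating = factor(coalesce(CreditGrade, ProsperRating.alpha),
                    levels = rating_order)
  )

# Number of loans for each rating/status combination
print(count(grouped, Rating, StatusGroup))
```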
What I see here is that the higher the rating, the greater the likelihood that the loan is either completed or current, and the lower the likelihood that it is past due, charged off, or defaulted. Overall, it seems that a borrower with a higher Prosper rating at the time the loan is posted will indeed be more likely to pay off the loan in the future. I will look at this plot more closely in the final section.
First, I want to get a sense of when these measures might be getting assigned, in cases where documentation does not make this clear. I will look at loans which have not been closed, to see if they systematically include this information (in addition to loans which have been closed). If they do, it’s relatively safe to say that these measures are predictions, rather than reports of actual yield.
summary(filter(data, is.na(ClosedDate)))
## ListingKey ListingNumber ListingCreationDate
## 17A93590655669644DB4C06: 6 Min. : 464139 Min. :2010-06-24
## 349D3587495831350F0F648: 4 1st Qu.: 682358 1st Qu.:2012-12-04
## 47C1359638497431975670B: 4 Median : 875238 Median :2013-08-20
## 8474358854651984137201C: 4 Mean : 870182 Mean :2013-05-16
## DE8535960513435199406CE: 4 3rd Qu.:1051465 3rd Qu.:2013-12-05
## 04C13599434217079754AEE: 3 Max. :1255725 Max. :2014-03-10
## (Other) :58823
## CreditGrade Term LoanStatus
## NC : 0 Min. :12.00 Current :56576
## HR : 0 1st Qu.:36.00 Past Due (1-15 days) : 806
## E : 0 Median :36.00 Past Due (31-60 days) : 363
## D : 0 Mean :44.47 Past Due (61-90 days) : 313
## C : 0 3rd Qu.:60.00 Past Due (91-120 days): 304
## (Other): 0 Max. :60.00 Past Due (16-30 days) : 265
## NA's :58848 (Other) : 221
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :NA Min. :0.06106 Min. :0.0577 Min. :0.0477
## 1st Qu.:NA 1st Qu.:0.16056 1st Qu.:0.1334 1st Qu.:0.1234
## Median :NA Median :0.20679 Median :0.1769 Median :0.1669
## Mean :NA Mean :0.21568 Mean :0.1856 Mean :0.1756
## 3rd Qu.:NA 3rd Qu.:0.26877 3rd Qu.:0.2346 3rd Qu.:0.2246
## Max. :NA Max. :0.38486 Max. :0.3435 Max. :0.3335
## NA's :58848
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :0.0474 Min. :0.00490 Min. :0.03700
## 1st Qu.:0.1181 1st Qu.:0.04200 1st Qu.:0.07400
## Median :0.1575 Median :0.06490 Median :0.08728
## Mean :0.1653 Mean :0.07435 Mean :0.09100
## 3rd Qu.:0.2086 3rd Qu.:0.10250 3rd Qu.:0.10790
## Max. :0.3057 Max. :0.20300 Max. :0.17610
##
## ProsperRating.num ProsperRating.alpha ProsperScore ListingCategory.num
## Min. :1.000 C :14528 Min. : 1.00 1 :40440
## 1st Qu.:3.000 B :12208 1st Qu.: 4.00 7 : 4452
## Median :4.000 A :10943 Median : 6.00 2 : 4189
## Mean :4.253 D : 8405 Mean : 5.81 3 : 2932
## 3rd Qu.:5.000 E : 5965 3rd Qu.: 8.00 13 : 1399
## Max. :7.000 AA : 3589 Max. :11.00 15 : 1152
## (Other): 3210 (Other): 4284
## BorrowerState Occupation EmploymentStatus
## CA : 7454 Other :14561 Employed :50831
## NY : 4214 Professional : 7113 Self-employed: 3208
## TX : 4090 Executive : 2522 Other : 3008
## FL : 3642 Teacher : 2111 Full-time : 1397
## IL : 2882 Computer Programmer: 1984 Not employed : 274
## OH : 2389 (Other) :29237 Retired : 98
## (Other):34177 NA's : 1320 (Other) : 32
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 0.0 Mode :logical Mode :logical
## 1st Qu.: 32.0 FALSE:27257 FALSE:57973
## Median : 79.0 TRUE :31591 TRUE :875
## Mean :108.3
## 3rd Qu.:156.0
## Max. :733.0
## NA's :10
## GroupKey DateCreditPulled
## 3D4D3366260257624AB272D: 110 Min. :2008-01-23
## 783C3371218786870A73D20: 79 1st Qu.:2012-12-03
## 52EA3425051368132B80C96: 41 Median :2013-08-22
## FEF83377364176536637E50: 29 Mean :2013-05-17
## 6A3B336601725506917317E: 26 3rd Qu.:2013-12-05
## (Other) : 387 Max. :2014-03-10
## NA's :58176
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. :600.0 Min. :619.0 Min. :1951-01-01
## 1st Qu.:660.0 1st Qu.:679.0 1st Qu.:1990-03-01
## Median :700.0 Median :719.0 Median :1995-11-22
## Mean :698.4 Mean :717.4 Mean :1994-11-04
## 3rd Qu.:720.0 3rd Qu.:739.0 3rd Qu.:2000-05-11
## Max. :880.0 Max. :899.0 Max. :2012-12-22
##
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.00 Min. : 0 Min. : 2.00
## 1st Qu.: 7.00 1st Qu.: 7 1st Qu.: 19.00
## Median :10.00 Median : 9 Median : 27.00
## Mean :10.92 Mean :10 Mean : 28.12
## 3rd Qu.:14.00 3rd Qu.:13 3rd Qu.: 36.00
## Max. :54.00 Max. :54 Max. :125.00
##
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.0000
## 1st Qu.: 5.000 1st Qu.: 188.0 1st Qu.: 0.0000
## Median : 7.000 Median : 344.0 Median : 0.0000
## Mean : 7.805 Mean : 466.6 Mean : 0.8649
## 3rd Qu.:10.000 3rd Qu.: 606.0 3rd Qu.: 1.0000
## Max. :50.000 Max. :13765.0 Max. :15.0000
##
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.0000 Min. : 0
## 1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.: 0
## Median : 3.000 Median : 0.0000 Median : 0
## Mean : 4.134 Mean : 0.3015 Mean : 931
## 3rd Qu.: 6.000 3rd Qu.: 0.0000 3rd Qu.: 0
## Max. :78.000 Max. :51.0000 Max. :463881
##
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 3.772 Mean : 0.2956
## 3rd Qu.: 2.000 3rd Qu.: 0.0000
## Max. :99.000 Max. :38.0000
##
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. : 0.00000 Min. : 0 Min. :0.0000
## 1st Qu.: 0.00000 1st Qu.: 4736 1st Qu.:0.3700
## Median : 0.00000 Median : 10388 Median :0.6200
## Mean : 0.00814 Mean : 19140 Mean :0.5862
## 3rd Qu.: 0.00000 3rd Qu.: 21972 3rd Qu.:0.8300
## Max. :20.00000 Max. :999165 Max. :1.8200
##
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 1.0 Min. :0.0800
## 1st Qu.: 1296 1st Qu.: 16.0 1st Qu.:0.8500
## Median : 4727 Median : 23.0 Median :0.9600
## Mean : 11506 Mean : 24.4 Mean :0.9097
## 3rd Qu.: 14111 3rd Qu.: 31.0 3rd Qu.:1.0000
## Max. :498374 Max. :108.0 Max. :1.0000
##
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.0000 Min. : 0.000 $50,000-74,999:18261
## 1st Qu.: 0.0000 1st Qu.: 0.160 $25,000-49,999:15848
## Median : 0.0000 Median : 0.230 $100,000+ :11273
## Mean : 0.7159 Mean : 0.263 $75,000-99,999:10474
## 3rd Qu.: 1.0000 3rd Qu.: 0.320 $1-24,999 : 2703
## Max. :16.0000 Max. :10.010 Not employed : 274
## NA's :4324 (Other) : 15
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 CB1B37030986463208432A1: 6
## FALSE:4368 1st Qu.: 3617 2DEE3698211017519D7333F: 4
## TRUE :54480 Median : 5167 9F4B37043517554537C364C: 4
## Mean : 6126 D895370150591392337ED6D: 4
## 3rd Qu.: 7417 E6FB37073953690388BC56D: 4
## Max. :1750003 0D8F37036734373301ED419: 3
## (Other) :58823
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :1.0 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.0 1st Qu.: 10.00 1st Qu.: 10.00
## Median :1.0 Median : 17.00 Median : 17.00
## Mean :1.5 Mean : 25.54 Mean : 24.81
## 3rd Qu.:2.0 3rd Qu.: 35.00 3rd Qu.: 35.00
## Max. :8.0 Max. :141.00 Max. :141.00
## NA's :47302 NA's :47302 NA's :47302
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00
## Median : 0.00 Median : 0.00
## Mean : 0.68 Mean : 0.05
## 3rd Qu.: 0.00 3rd Qu.: 0.00
## Max. :42.00 Max. :21.00
## NA's :47302 NA's :47302
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0.00
## 1st Qu.: 4000 1st Qu.: 0.01
## Median : 7400 Median : 2213.24
## Mean : 9721 Mean : 3475.83
## 3rd Qu.:13500 3rd Qu.: 5204.00
## Max. :72499 Max. :23450.95
## NA's :47302 NA's :47302
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-209.0 Min. : 0.000
## 1st Qu.: -38.0 1st Qu.: 0.000
## Median : -9.0 Median : 0.000
## Mean : -8.6 Mean : 1.468
## 3rd Qu.: 18.0 3rd Qu.: 0.000
## Max. : 220.0 Max. :129.000
## NA's :50362
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 1.00 Min. : 0.00 Min. : 43212
## 1st Qu.: 1.00 1st Qu.: 3.00 1st Qu.: 79386
## Median : 7.50 Median : 7.00 Median :100276
## Mean :11.88 Mean : 9.68 Mean : 98941
## 3rd Qu.:17.25 3rd Qu.:15.00 3rd Qu.:121614
## Max. :38.00 Max. :45.00 Max. :136486
## NA's :58840
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1500 Min. :2010-06-30 Q4 2013:14058
## 1st Qu.: 4000 1st Qu.:2012-12-18 Q1 2014:12103
## Median :10000 Median :2013-08-29 Q3 2013: 8592
## Mean :10280 Mean :2013-05-27 Q2 2013: 6268
## 3rd Qu.:15000 3rd Qu.:2013-12-16 Q3 2012: 3419
## Max. :35000 Max. :2014-03-12 Q4 2012: 3022
## (Other):11386
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## F80D3694083622957BA09F2: 6 Min. : 0.0 Min. : 0
## 0F0C35762146892131F3BB4: 4 1st Qu.: 166.6 1st Qu.: 555
## 22B53699795042922A27DCC: 4 Median : 286.9 Median : 1516
## 61E93477058090904D07D4F: 4 Mean : 318.1 Mean : 2550
## 946A35068649687154063A9: 4 3rd Qu.: 415.1 3rd Qu.: 3367
## EA463494084516244B9C542: 4 Max. :2163.6 Max. :31613
## (Other) :58822
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : 0.0 Min. :-564.85
## 1st Qu.: 286.8 1st Qu.: 221.7 1st Qu.: -73.29
## Median : 795.5 Median : 640.4 Median : -34.75
## Mean : 1519.1 Mean : 1031.3 Mean : -55.72
## 3rd Qu.: 1872.4 3rd Qu.: 1410.9 3rd Qu.: -13.11
## Max. :30831.1 Max. :10572.8 Max. : 0.77
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-1242.460 Min. :0 Min. :0
## 1st Qu.: 0.000 1st Qu.:0 1st Qu.:0
## Median : 0.000 Median :0 Median :0
## Mean : -4.171 Mean :0 Mean :0
## 3rd Qu.: 0.000 3rd Qu.:0 3rd Qu.:0
## Max. : 0.000 Max. :0 Max. :0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. :0 Min. :0.7000 Min. : 0.000000
## 1st Qu.:0 1st Qu.:1.0000 1st Qu.: 0.000000
## Median :0 Median :1.0000 Median : 0.000000
## Mean :0 Mean :0.9986 Mean : 0.009312
## 3rd Qu.:0 3rd Qu.:1.0000 3rd Qu.: 0.000000
## Max. :0 Max. :1.0125 Max. :19.000000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.0000 Min. : 1.00
## 1st Qu.:0.00000 1st Qu.: 0.0000 1st Qu.: 1.00
## Median :0.00000 Median : 0.0000 Median : 8.00
## Mean :0.00226 Mean : 0.6037 Mean : 57.62
## 3rd Qu.:0.00000 3rd Qu.: 0.0000 3rd Qu.: 79.00
## Max. :6.00000 Max. :3000.0000 Max. :779.00
##
## Rating
## C :14528
## B :12208
## A :10943
## D : 8405
## E : 5965
## AA : 3589
## (Other): 3210
All of these open loans have non-zero values assigned to the following measures, suggesting that these measures are predictive, rather than descriptive of actual outcomes: LenderYield, EstimatedEffectiveYield, EstimatedLoss, EstimatedReturn. On the other hand, many open loans have zero values assigned for these profit measures: LP_CustomerPayments, LP_CustomerPrincipalPayments, LP_InterestandFees, LP_ServiceFees, LP_CollectionFees, LP_GrossPrincipalLoss, LP_NetPrincipalLoss, and LP_NonPrincipalRecoverypayments (in fact, the last 3 have only zero values assigned). These I will take a closer look at.
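The share of zero values per LP_ column among open loans can also be computed directly. This is a minimal sketch on made-up data (`toy`) using a few of the Prosper column names; `open_loans`, `lp_cols`, and `zero_share` are names introduced for illustration.

```r
library(dplyr)

# Toy stand-in for the Prosper data (values are made up for illustration)
toy <- tibble(
  ClosedDate            = as.Date(c(NA, NA, NA, "2010-01-02")),
  LP_CustomerPayments   = c(0, 1516, 555, 4200),
  LP_InterestandFees    = c(0, 640.4, 221.7, 900),
  LP_GrossPrincipalLoss = c(0, 0, 0, 1800)
)

# Keep only loans that have not been closed
open_loans <- filter(toy, is.na(ClosedDate))

# Proportion of zero values in each LP_ column
lp_cols    <- grep("^LP_", names(open_loans), value = TRUE)
zero_share <- colMeans(open_loans[lp_cols] == 0)

print(zero_share)
```

A column with a share of 1 holds only zeros for open loans, which is the pattern I describe for the last three measures.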
LP_CustomerPayments
# data where there are no customer payments
summary(filter(data, LP_CustomerPayments==0))
## ListingKey ListingNumber ListingCreationDate
## 8474358854651984137201C: 4 Min. : 908 Min. :2006-02-28
## 04C13599434217079754AEE: 3 1st Qu.:1159590 1st Qu.:2014-02-03
## 0A0635972629771021E38F3: 3 Median :1188042 Median :2014-02-14
## 26C835968174004476E551B: 3 Mean :1122524 Mean :2013-10-02
## 78D835971025680406A3489: 3 3rd Qu.:1213931 3rd Qu.:2014-02-25
## 873E36032681397836823F7: 3 Max. :1255725 Max. :2014-03-10
## (Other) :6189
## CreditGrade Term LoanStatus
## HR : 125 Min. :12.00 Current :5695
## E : 54 1st Qu.:36.00 Chargedoff : 267
## C : 32 Median :36.00 Defaulted : 199
## D : 26 Mean :43.22 Completed : 10
## B : 22 3rd Qu.:60.00 Past Due (31-60 days): 7
## (Other): 18 Max. :60.00 Past Due (61-90 days): 7
## NA's :5931 (Other) : 23
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2006-03-29 Min. :0.00864 Min. :0.0021 Min. :-0.0029
## 1st Qu.:2007-08-13 1st Qu.:0.14243 1st Qu.:0.1159 1st Qu.: 0.1059
## Median :2008-12-31 Median :0.18222 Median :0.1500 Median : 0.1400
## Mean :2009-11-19 Mean :0.18989 Mean :0.1613 Mean : 0.1512
## 3rd Qu.:2012-05-31 3rd Qu.:0.22301 3rd Qu.:0.1920 3rd Qu.: 0.1820
## Max. :2014-03-03 Max. :0.42395 Max. :0.3600 Max. : 0.3400
## NA's :5727
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.01660 Min. :0.00600 Min. :-0.01660
## 1st Qu.: 0.09989 1st Qu.:0.03490 1st Qu.: 0.06491
## Median : 0.12898 Median :0.05740 Median : 0.07349
## Mean : 0.13745 Mean :0.06419 Mean : 0.07387
## 3rd Qu.: 0.16463 3rd Qu.:0.08490 3rd Qu.: 0.08027
## Max. : 0.30570 Max. :0.25000 Max. : 0.19100
## NA's :277 NA's :277 NA's :277
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 C :1634 Min. : 1.000
## 1st Qu.:4.000 A :1354 1st Qu.: 4.000
## Median :5.000 B :1302 Median : 6.000
## Mean :4.637 D : 552 Mean : 5.971
## 3rd Qu.:6.000 AA : 523 3rd Qu.: 8.000
## Max. :7.000 (Other): 566 Max. :11.000
## NA's :277 NA's : 277 NA's :277
## ListingCategory.num BorrowerState Occupation
## 1 :4548 CA : 851 Other :1350
## 7 : 483 TX : 427 Professional : 722
## 2 : 287 NY : 410 Executive : 260
## 0 : 193 FL : 386 Computer Programmer: 213
## 3 : 180 IL : 301 Teacher : 195
## 15 : 109 (Other):3814 (Other) :3127
## (Other): 408 NA's : 19 NA's : 341
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Employed :5087 Min. : 0.00 Mode :logical
## Self-employed: 438 1st Qu.: 26.25 FALSE:3006
## Other : 365 Median : 77.00 TRUE :3202
## Full-time : 192 Mean :104.79
## Not available: 79 3rd Qu.:153.00
## (Other) : 16 Max. :649.00
## NA's : 31 NA's :110
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 6A3B336601725506917317E: 15 Min. :2006-02-16
## FALSE:6052 3D4D3366260257624AB272D: 12 1st Qu.:2014-02-04
## TRUE :156 783C3371218786870A73D20: 10 Median :2014-02-14
## FEF83377364176536637E50: 9 Mean :2013-10-03
## F3BE336490588367617A2BA: 7 3rd Qu.:2014-02-24
## (Other) : 84 Max. :2014-03-10
## NA's :6071
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1953-12-01
## 1st Qu.:660.0 1st Qu.:679.0 1st Qu.:1990-07-01
## Median :680.0 Median :699.0 Median :1996-07-29
## Mean :688.2 Mean :707.2 Mean :1995-06-11
## 3rd Qu.:720.0 3rd Qu.:739.0 3rd Qu.:2001-02-12
## Max. :840.0 Max. :859.0 Max. :2012-09-17
## NA's :1 NA's :1 NA's :3
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.00 Min. : 0.0 Min. : 2.00
## 1st Qu.: 8.00 1st Qu.: 7.0 1st Qu.: 19.00
## Median :10.00 Median :10.0 Median : 26.00
## Mean :11.34 Mean :10.5 Mean : 27.95
## 3rd Qu.:14.00 3rd Qu.:13.0 3rd Qu.: 35.00
## Max. :54.00 Max. :54.0 Max. :125.00
## NA's :110 NA's :110 NA's :3
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 5.000 1st Qu.: 196.8 1st Qu.: 0.000
## Median : 7.000 Median : 363.5 Median : 1.000
## Mean : 8.115 Mean : 500.0 Mean : 1.073
## 3rd Qu.:10.000 3rd Qu.: 666.0 3rd Qu.: 1.000
## Max. :50.000 Max. :7090.0 Max. :53.000
## NA's :3
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.0000 Min. : 0.0
## 1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.: 0.0
## Median : 4.000 Median : 0.0000 Median : 0.0
## Mean : 4.863 Mean : 0.4219 Mean : 679.2
## 3rd Qu.: 7.000 3rd Qu.: 0.0000 3rd Qu.: 0.0
## Max. :70.000 Max. :83.0000 Max. :183396.0
## NA's :7 NA's :3 NA's :111
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 3.539 Mean : 0.2967
## 3rd Qu.: 1.000 3rd Qu.: 0.0000
## Max. :99.000 Max. :38.0000
## NA's :6 NA's :3
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.00000 Min. : 0 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.: 5154 1st Qu.:0.3600
## Median :0.00000 Median : 11390 Median :0.6100
## Mean :0.00508 Mean : 21079 Mean :0.5784
## 3rd Qu.:0.00000 3rd Qu.: 24664 3rd Qu.:0.8200
## Max. :2.00000 Max. :976426 Max. :1.9000
## NA's :110 NA's :110 NA's :110
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 1.00 Min. :0.0000
## 1st Qu.: 1762 1st Qu.: 16.00 1st Qu.:0.8800
## Median : 5798 Median : 23.00 Median :0.9700
## Mean : 13349 Mean : 24.48 Mean :0.9195
## 3rd Qu.: 17087 3rd Qu.: 31.00 3rd Qu.:1.0000
## Max. :221237 Max. :108.00 Max. :1.0000
## NA's :110 NA's :110 NA's :110
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.0000 Min. : 0.0100 $50,000-74,999:1889
## 1st Qu.: 0.0000 1st Qu.: 0.1700 $25,000-49,999:1582
## Median : 1.0000 Median : 0.2400 $100,000+ :1281
## Mean : 0.8129 Mean : 0.2619 $75,000-99,999:1104
## 3rd Qu.: 1.0000 3rd Qu.: 0.3300 $1-24,999 : 232
## Max. :12.0000 Max. :10.0100 Not displayed : 113
## NA's :110 NA's :458 (Other) : 7
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 E6FB37073953690388BC56D: 4
## FALSE:459 1st Qu.: 3723 10D33705822704973E703BB: 3
## TRUE :5749 Median : 5313 1C10370687519959757D4E0: 3
## Mean : 6179 50F23708735181834951669: 3
## 3rd Qu.: 7542 547237051355919565459AB: 3
## Max. :70833 5D463706577381028D227CB: 3
## (Other) :6189
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :1.000 Min. : 1.00 Min. : 0.00
## 1st Qu.:1.000 1st Qu.: 10.00 1st Qu.: 10.00
## Median :1.000 Median : 18.00 Median : 17.00
## Mean :1.556 Mean : 25.45 Mean : 24.87
## 3rd Qu.:2.000 3rd Qu.: 35.00 3rd Qu.: 33.00
## Max. :7.000 Max. :131.00 Max. :131.00
## NA's :5564 NA's :5564 NA's :5564
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.000 Min. :0.000
## 1st Qu.: 0.000 1st Qu.:0.000
## Median : 0.000 Median :0.000
## Mean : 0.562 Mean :0.022
## 3rd Qu.: 0.000 3rd Qu.:0.000
## Max. :26.000 Max. :3.000
## NA's :5564 NA's :5564
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0
## 1st Qu.: 4000 1st Qu.: 0
## Median : 9700 Median : 2258
## Mean :11828 Mean : 4173
## 3rd Qu.:15000 3rd Qu.: 6902
## Max. :63000 Max. :22364
## NA's :5564 NA's :5564
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-121.000 Min. : 0.00
## 1st Qu.: -38.000 1st Qu.: 0.00
## Median : -1.000 Median : 0.00
## Mean : -5.291 Mean : 60.99
## 3rd Qu.: 22.500 3rd Qu.: 0.00
## Max. : 117.000 Max. :2599.00
## NA's :6129
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 1.000 Min. : 0.00 Min. : 125
## 1st Qu.: 5.000 1st Qu.: 0.00 1st Qu.:131442
## Median : 5.000 Median : 1.00 Median :132978
## Mean : 5.263 Mean : 5.08 Mean :125607
## 3rd Qu.: 5.000 3rd Qu.: 1.00 3rd Qu.:134493
## Max. :24.000 Max. :96.00 Max. :136486
## NA's :5745
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2006-03-06 Q1 2014:5710
## 1st Qu.: 5000 1st Qu.:2014-02-12 Q4 2006: 39
## Median :10000 Median :2014-02-21 Q1 2007: 39
## Mean :11431 Mean :2013-10-12 Q3 2006: 36
## 3rd Qu.:15000 3rd Qu.:2014-03-04 Q2 2007: 32
## Max. :35000 Max. :2014-03-12 Q3 2012: 31
## (Other): 321
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 0F0C35762146892131F3BB4: 4 Min. : 31.52 Min. :0
## 19C63381132863377E5F08A: 3 1st Qu.: 174.56 1st Qu.:0
## 46B3370043839462265FEAF: 3 Median : 330.13 Median :0
## 51913705343598682656AAA: 3 Mean : 350.06 Mean :0
## 74353588027285527C8B32C: 3 3rd Qu.: 479.09 3rd Qu.:0
## 744F3495355780315032650: 3 Max. :1207.30 Max. :0
## (Other) :6189
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. :0 Min. :0 Min. :-92.47000
## 1st Qu.:0 1st Qu.:0 1st Qu.: 0.00000
## Median :0 Median :0 Median : 0.00000
## Mean :0 Mean :0 Mean : -0.03636
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.: 0.00000
## Max. :0 Max. :0 Max. : 0.00000
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-653.67 Min. : 0.0 Min. : 0.0
## 1st Qu.: 0.00 1st Qu.: 0.0 1st Qu.: 0.0
## Median : 0.00 Median : 0.0 Median : 0.0
## Mean : -1.25 Mean : 356.5 Mean : 352.6
## 3rd Qu.: 0.00 3rd Qu.: 0.0 3rd Qu.: 0.0
## Max. : 0.00 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.000 Min. :0.7197 Min. :0.000000
## 1st Qu.: 0.000 1st Qu.:1.0000 1st Qu.:0.000000
## Median : 0.000 Median :1.0000 Median :0.000000
## Mean : 4.963 Mean :0.9999 Mean :0.005799
## 3rd Qu.: 0.000 3rd Qu.:1.0000 3rd Qu.:0.000000
## Max. :2440.000 Max. :1.0000 Max. :2.000000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.000 Min. : 1.0
## 1st Qu.:0.00000 1st Qu.: 0.000 1st Qu.: 1.0
## Median :0.00000 Median : 0.000 Median : 1.0
## Mean :0.00145 Mean : 1.898 Mean : 33.7
## 3rd Qu.:0.00000 3rd Qu.: 0.000 3rd Qu.: 18.0
## Max. :1.00000 Max. :7425.000 Max. :745.0
##
## Rating
## C :1666
## A :1366
## B :1324
## D : 578
## AA : 527
## E : 501
## (Other): 246
# closed loans with no customer payments
summary(filter(data, !is.na(ClosedDate) &
LP_CustomerPayments==0))
## ListingKey ListingNumber ListingCreationDate
## 00AF3373975597240A81AE3: 1 Min. : 908 Min. :2006-02-28
## 013433665791725254947A3: 1 1st Qu.:106927 1st Qu.:2007-03-05
## 016D3367858315895FD1C66: 1 Median :371903 Median :2008-07-23
## 017A35059628935388E8DE8: 1 Mean :343341 Mean :2009-05-27
## 03683364814602688549341: 1 3rd Qu.:546908 3rd Qu.:2011-12-27
## 03B03365352072616FEEA72: 1 Max. :932346 Max. :2013-09-26
## (Other) :475
## CreditGrade Term LoanStatus
## HR :125 Min. :12.00 Chargedoff :267
## E : 54 1st Qu.:36.00 Defaulted :199
## C : 32 Median :36.00 Completed : 10
## D : 26 Mean :37.25 Cancelled : 5
## B : 22 3rd Qu.:36.00 Current : 0
## (Other): 18 Max. :60.00 FinalPaymentInProgress: 0
## NA's :204 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2006-03-29 Min. :0.00864 Min. :0.0021 Min. :-0.0029
## 1st Qu.:2007-08-13 1st Qu.:0.24142 1st Qu.:0.2199 1st Qu.: 0.2088
## Median :2008-12-31 Median :0.29776 Median :0.2875 Median : 0.2700
## Mean :2009-11-19 Mean :0.28514 Mean :0.2612 Mean : 0.2501
## 3rd Qu.:2012-05-31 3rd Qu.:0.35372 3rd Qu.:0.3177 3rd Qu.: 0.3077
## Max. :2014-03-03 Max. :0.42395 Max. :0.3600 Max. : 0.3400
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.0166 Min. :0.0060 Min. :-0.0166
## 1st Qu.: 0.1988 1st Qu.:0.1019 1st Qu.: 0.1158
## Median : 0.2760 Median :0.1470 Median : 0.1246
## Mean : 0.2384 Mean :0.1306 Mean : 0.1253
## 3rd Qu.: 0.2896 3rd Qu.:0.1650 3rd Qu.: 0.1439
## Max. : 0.3057 Max. :0.2500 Max. : 0.1910
## NA's :277 NA's :277 NA's :277
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 HR : 74 Min. : 1.000
## 1st Qu.:1.000 E : 51 1st Qu.: 4.000
## Median :2.000 D : 35 Median : 5.000
## Mean :2.368 C : 25 Mean : 4.824
## 3rd Qu.:3.000 B : 14 3rd Qu.: 6.000
## Max. :7.000 (Other): 5 Max. :10.000
## NA's :277 NA's :277 NA's :277
## ListingCategory.num BorrowerState Occupation
## 0 :192 CA : 66 Other :125
## 1 :113 TX : 38 Professional : 33
## 7 : 67 FL : 35 Clerical : 26
## 2 : 25 IL : 32 Administrative Assistant: 22
## 3 : 21 GA : 28 Sales - Commission : 17
## 4 : 14 (Other):263 (Other) :227
## (Other): 49 NA's : 19 NA's : 31
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Full-time :166 Min. : 0.0 Mode :logical
## Employed :152 1st Qu.: 15.0 FALSE:331
## Not available: 79 Median : 43.0 TRUE :150
## Self-employed: 29 Mean : 73.1
## Other : 8 3rd Qu.:102.5
## (Other) : 16 Max. :491.0
## NA's : 31 NA's :110
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 6A3B336601725506917317E: 15 Min. :2006-02-16
## FALSE:351 3D4D3366260257624AB272D: 12 1st Qu.:2007-02-26
## TRUE :130 783C3371218786870A73D20: 10 Median :2008-07-12
## FEF83377364176536637E50: 9 Mean :2009-05-22
## F3BE336490588367617A2BA: 7 3rd Qu.:2011-12-20
## (Other) : 84 Max. :2013-09-26
## NA's :344
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1964-01-01
## 1st Qu.:540.0 1st Qu.:559.0 1st Qu.:1992-03-10
## Median :640.0 Median :659.0 Median :1997-10-24
## Mean :621.3 Mean :640.3 Mean :1996-04-03
## 3rd Qu.:700.0 3rd Qu.:719.0 3rd Qu.:2001-09-10
## Max. :820.0 Max. :839.0 Max. :2010-10-20
## NA's :1 NA's :1 NA's :3
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.000 Min. : 2.00
## 1st Qu.: 3.000 1st Qu.: 3.000 1st Qu.:11.00
## Median : 7.000 Median : 5.000 Median :18.00
## Mean : 7.372 Mean : 6.404 Mean :21.31
## 3rd Qu.:11.000 3rd Qu.: 9.000 3rd Qu.:28.00
## Max. :41.000 Max. :40.000 Max. :99.00
## NA's :110 NA's :110 NA's :3
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.00 Min. : 0.0 Min. : 0.000
## 1st Qu.: 1.00 1st Qu.: 0.0 1st Qu.: 1.000
## Median : 3.00 Median : 65.0 Median : 2.000
## Mean : 4.05 Mean : 222.1 Mean : 3.638
## 3rd Qu.: 6.00 3rd Qu.: 255.0 3rd Qu.: 5.000
## Max. :41.00 Max. :5467.0 Max. :53.000
## NA's :3
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.000 Min. : 0
## 1st Qu.: 4.000 1st Qu.: 0.000 1st Qu.: 0
## Median : 7.000 Median : 0.000 Median : 0
## Mean : 9.344 Mean : 3.444 Mean : 3249
## 3rd Qu.:11.000 3rd Qu.: 4.000 3rd Qu.: 655
## Max. :70.000 Max. :83.000 Max. :183396
## NA's :7 NA's :3 NA's :111
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 7.339 Mean : 0.4561
## 3rd Qu.: 9.000 3rd Qu.: 1.0000
## Max. :99.000 Max. :13.0000
## NA's :6 NA's :3
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.00000 Min. : 0 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.: 255 1st Qu.:0.0100
## Median :0.00000 Median : 2549 Median :0.3800
## Mean :0.03235 Mean : 11643 Mean :0.4461
## 3rd Qu.:0.00000 3rd Qu.: 9612 3rd Qu.:0.8100
## Max. :1.00000 Max. :277236 Max. :1.9000
## NA's :110 NA's :110 NA's :110
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0.0 Min. : 1.00 Min. :0.0000
## 1st Qu.: 63.5 1st Qu.: 8.00 1st Qu.:0.6600
## Median : 1617.0 Median :15.00 Median :0.9000
## Mean : 8076.8 Mean :17.23 Mean :0.7956
## 3rd Qu.: 9352.0 3rd Qu.:23.00 3rd Qu.:1.0000
## Max. :221237.0 Max. :71.00 Max. :1.0000
## NA's :110 NA's :110 NA's :110
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.0100 $25,000-49,999:164
## 1st Qu.: 0.000 1st Qu.: 0.0800 Not displayed :113
## Median : 1.000 Median : 0.1600 $50,000-74,999: 74
## Mean : 1.234 Mean : 0.2952 $1-24,999 : 47
## 3rd Qu.: 2.000 3rd Qu.: 0.3000 $75,000-99,999: 42
## Max. :12.000 Max. :10.0100 $100,000+ : 34
## NA's :110 NA's :49 (Other) : 7
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 002E362825862155611E637: 1
## FALSE:50 1st Qu.: 2250 00CB365406197330833A161: 1
## TRUE :431 Median : 3333 01553609568887611BBF798: 1
## Mean : 4167 02E53381916276403AA12CE: 1
## 3rd Qu.: 5083 037C34042225296828F4D0A: 1
## Max. :25000 03B836476405253355E7A2E: 1
## (Other) :475
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :1.000 Min. : 3.00 Min. : 2.00
## 1st Qu.:1.000 1st Qu.: 6.00 1st Qu.: 6.00
## Median :1.000 Median :10.00 Median :10.00
## Mean :1.204 Mean :14.98 Mean :13.93
## 3rd Qu.:1.000 3rd Qu.:20.00 3rd Qu.:17.75
## Max. :3.000 Max. :54.00 Max. :54.00
## NA's :437 NA's :437 NA's :437
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.0000 Min. :0.0000
## 1st Qu.: 0.0000 1st Qu.:0.0000
## Median : 0.0000 Median :0.0000
## Mean : 0.9545 Mean :0.0909
## 3rd Qu.: 1.0000 3rd Qu.:0.0000
## Max. :19.0000 Max. :3.0000
## NA's :437 NA's :437
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0.000
## 1st Qu.: 3000 1st Qu.: 0.008
## Median : 4400 Median : 2011.520
## Mean : 6192 Mean : 3086.748
## 3rd Qu.: 8250 3rd Qu.: 3615.912
## Max. :15900 Max. :14287.750
## NA's :437 NA's :437
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-121.00 Min. : 0.0
## 1st Qu.: -61.75 1st Qu.: 219.0
## Median : -25.50 Median : 516.0
## Mean : -21.64 Mean : 784.2
## 3rd Qu.: 3.75 3rd Qu.:1105.0
## Max. : 117.00 Max. :2599.0
## NA's :437
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 1.000 Min. : 5.00 Min. : 125
## 1st Qu.: 5.000 1st Qu.:27.00 1st Qu.: 8496
## Median : 5.000 Median :67.00 Median : 35037
## Mean : 5.263 Mean :57.33 Mean : 35819
## 3rd Qu.: 5.000 3rd Qu.:84.00 3rd Qu.: 58275
## Max. :24.000 Max. :96.00 Max. :103467
## NA's :18
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2006-03-06 Q4 2006: 39
## 1st Qu.: 2100 1st Qu.:2007-03-14 Q1 2007: 39
## Median : 3500 Median :2008-08-05 Q3 2006: 36
## Mean : 4709 Mean :2009-06-06 Q2 2007: 32
## 3rd Qu.: 5000 3rd Qu.:2011-12-30 Q3 2012: 31
## Max. :25000 Max. :2013-10-01 Q3 2008: 27
## (Other):277
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 00213395573199409CDA304: 1 Min. : 31.52 Min. :0
## 010434177852874428479BC: 1 1st Qu.: 87.49 1st Qu.:0
## 011D3380183567215AECD54: 1 Median : 135.28 Median :0
## 01873419418028013A65F9B: 1 Mean : 184.44 Mean :0
## 01963423813444177D162DE: 1 3rd Qu.: 209.53 3rd Qu.:0
## 01F23424487177149681C65: 1 Max. :1047.64 Max. :0
## (Other) :475
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. :0 Min. :0 Min. :-92.4700
## 1st Qu.:0 1st Qu.:0 1st Qu.: 0.0000
## Median :0 Median :0 Median : 0.0000
## Mean :0 Mean :0 Mean : -0.4692
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.: 0.0000
## Max. :0 Max. :0 Max. : 0.0000
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-653.67 Min. : 0 Min. : 0
## 1st Qu.: 0.00 1st Qu.: 2000 1st Qu.: 2000
## Median : 0.00 Median : 3500 Median : 3300
## Mean : -16.14 Mean : 4601 Mean : 4550
## 3rd Qu.: 0.00 3rd Qu.: 5000 3rd Qu.: 5000
## Max. : 0.00 Max. :25000 Max. :25000
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.00 Min. :0.7197 Min. :0.00000
## 1st Qu.: 0.00 1st Qu.:1.0000 1st Qu.:0.00000
## Median : 0.00 Median :1.0000 Median :0.00000
## Mean : 64.06 Mean :0.9986 Mean :0.05405
## 3rd Qu.: 0.00 3rd Qu.:1.0000 3rd Qu.:0.00000
## Max. :2440.00 Max. :1.0000 Max. :2.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.00 Min. : 1.00
## 1st Qu.:0.00000 1st Qu.: 0.00 1st Qu.: 16.00
## Median :0.00000 Median : 0.00 Median : 36.00
## Mean :0.01871 Mean : 24.49 Mean : 58.08
## 3rd Qu.:0.00000 3rd Qu.: 0.00 3rd Qu.: 71.00
## Max. :1.00000 Max. :7425.00 Max. :745.00
##
## Rating
## HR :199
## E :105
## D : 61
## C : 57
## B : 36
## A : 15
## (Other): 8
# closed completed loans with no customer payments
summary(filter(data, !is.na(ClosedDate) &
                 LP_CustomerPayments == 0 &
                 LoanStatus == "Completed"))
## ListingKey ListingNumber ListingCreationDate
## 0D113451667173664D2D2EB:1 Min. :415054 Min. :2009-04-28
## 21F63451223082614E3D321:1 1st Qu.:415118 1st Qu.:2009-04-28
## 27D034509373504094A753E:1 Median :415327 Median :2009-04-29
## 36C63450215037018088662:1 Mean :415293 Mean :2009-04-29
## 44773450501513236DCFCEA:1 3rd Qu.:415412 3rd Qu.:2009-04-30
## 532034516025331845B3905:1 Max. :415577 Max. :2009-05-02
## (Other) :4
## CreditGrade Term LoanStatus
## C :3 Min. :36 Completed :10
## B :3 1st Qu.:36 Cancelled : 0
## D :2 Median :36 Chargedoff : 0
## A :1 Mean :36 Current : 0
## AA :1 3rd Qu.:36 Defaulted : 0
## NC :0 Max. :36 FinalPaymentInProgress: 0
## (Other):0 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2009-06-15 Min. :0.09677 Min. :0.0760 Min. :0.0660
## 1st Qu.:2010-11-26 1st Qu.:0.12887 1st Qu.:0.1018 1st Qu.:0.0918
## Median :2011-08-19 Median :0.20999 Median :0.1632 Median :0.1532
## Mean :2011-05-26 Mean :0.22253 Mean :0.1863 Mean :0.1763
## 3rd Qu.:2012-02-24 3rd Qu.:0.27000 3rd Qu.:0.2335 3rd Qu.:0.2235
## Max. :2012-05-13 Max. :0.39951 Max. :0.3500 Max. :0.3400
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn ProsperRating.num
## Min. : NA Min. : NA Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA Median : NA Median : NA
## Mean :NaN Mean :NaN Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA Max. : NA Max. : NA
## NA's :10 NA's :10 NA's :10 NA's :10
## ProsperRating.alpha ProsperScore ListingCategory.num BorrowerState
## NC : 0 Min. : NA 1 :4 GA :3
## HR : 0 1st Qu.: NA 5 :2 IL :1
## E : 0 Median : NA 7 :2 MD :1
## D : 0 Mean :NaN 3 :1 MN :1
## C : 0 3rd Qu.: NA 6 :1 NJ :1
## (Other): 0 Max. : NA 0 :0 OH :1
## NA's :10 NA's :10 (Other):0 (Other):2
## Occupation EmploymentStatus
## Clerical :1 Full-time :9
## Computer Programmer :1 Retired :1
## Engineer - Electrical :1 Employed :0
## Food Service Management:1 Not available:0
## Military Enlisted :1 Not employed :0
## Other :1 Other :0
## (Other) :4 (Other) :0
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 7.00 Mode :logical Mode :logical
## 1st Qu.: 29.00 FALSE:5 FALSE:9
## Median : 65.00 TRUE :5 TRUE :1
## Mean : 74.60
## 3rd Qu.: 91.75
## Max. :201.00
##
## GroupKey DateCreditPulled CreditScoreRangeLower
## FEF83377364176536637E50:1 Min. :2009-04-28 Min. :620
## 00343376901312423168731:0 1st Qu.:2009-04-28 1st Qu.:640
## 00943382969547936B0C529:0 Median :2009-04-28 Median :670
## 00AE3392027644405556335:0 Mean :2009-04-28 Mean :682
## 016833805323396548B2370:0 3rd Qu.:2009-04-28 3rd Qu.:700
## (Other) :0 Max. :2009-04-30 Max. :820
## NA's :9
## CreditScoreRangeUpper FirstRecordedCreditLine CurrentCreditLines
## Min. :639 Min. :1976-02-25 Min. : 3.00
## 1st Qu.:659 1st Qu.:1989-04-08 1st Qu.: 6.25
## Median :689 Median :1994-01-04 Median :10.00
## Mean :701 Mean :1994-02-22 Mean :10.80
## 3rd Qu.:719 3rd Qu.:2001-10-18 3rd Qu.:14.25
## Max. :839 Max. :2007-05-17 Max. :21.00
##
## OpenCreditLines TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : 3.00 Min. : 7.0 Min. :2.00
## 1st Qu.: 6.25 1st Qu.:15.0 1st Qu.:4.25
## Median : 8.50 Median :21.5 Median :6.50
## Mean : 8.70 Mean :30.4 Mean :6.00
## 3rd Qu.:11.25 3rd Qu.:39.0 3rd Qu.:7.75
## Max. :14.00 Max. :78.0 Max. :9.00
##
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. : 15.0 Min. :0.0 Min. : 1.00
## 1st Qu.: 98.5 1st Qu.:0.0 1st Qu.: 3.25
## Median :171.0 Median :0.0 Median : 5.00
## Mean :223.3 Mean :1.3 Mean : 6.70
## 3rd Qu.:269.2 3rd Qu.:1.0 3rd Qu.: 7.25
## Max. :840.0 Max. :8.0 Max. :23.00
##
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. :0 Min. :0 Min. :0.0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0.0
## Median :0 Median :0 Median :0.0
## Mean :0 Mean :0 Mean :0.9
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0.0
## Max. :0 Max. :0 Max. :5.0
##
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. :0.0 Min. :0 Min. : 498
## 1st Qu.:0.0 1st Qu.:0 1st Qu.: 1867
## Median :0.0 Median :0 Median : 4888
## Mean :0.1 Mean :0 Mean :10410
## 3rd Qu.:0.0 3rd Qu.:0 3rd Qu.:12075
## Max. :1.0 Max. :0 Max. :35522
##
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. :0.0500 Min. : 129 Min. : 3.0
## 1st Qu.:0.3000 1st Qu.: 607 1st Qu.:13.0
## Median :0.4250 Median : 4952 Median :18.5
## Mean :0.5240 Mean :10860 Mean :25.7
## 3rd Qu.:0.8475 3rd Qu.:19643 3rd Qu.:37.0
## Max. :0.9900 Max. :33730 Max. :68.0
##
## TradesNeverDelinquent.per TradesOpenedLast6Months DebtToIncomeRatio
## Min. :0.420 Min. :0.0 Min. :0.1000
## 1st Qu.:0.890 1st Qu.:0.0 1st Qu.:0.1225
## Median :0.960 Median :1.0 Median :0.1650
## Mean :0.896 Mean :1.4 Mean :0.2700
## 3rd Qu.:1.000 3rd Qu.:2.0 3rd Qu.:0.2175
## Max. :1.000 Max. :6.0 Max. :1.2700
##
## IncomeRange IncomeVerifiable StatedMonthlyIncome
## $25,000-49,999:5 Mode:logical Min. : 66.67
## $50,000-74,999:2 TRUE:10 1st Qu.: 3147.50
## $100,000+ :2 Median : 3817.25
## $1-24,999 :1 Mean : 4753.66
## Not displayed :0 3rd Qu.: 5750.00
## Not employed :0 Max. :10885.42
## (Other) :0
## LoanKey TotalProsperLoans TotalProsperPaymentsBilled
## 04B13555695010627F64371:1 Min. :1.000 Min. : 6.00
## 07D135567368272216AB044:1 1st Qu.:1.000 1st Qu.:10.00
## 158335561462412446E1A4D:1 Median :1.000 Median :11.00
## 24973555335786563CA1C8D:1 Mean :1.444 Mean :13.67
## 98F5355630240037783B9E6:1 3rd Qu.:2.000 3rd Qu.:13.00
## 9B4B355449621339172BFA4:1 Max. :3.000 Max. :33.00
## (Other) :4 NA's :1 NA's :1
## OnTimeProsperPayments ProsperPaymentsLessThanOneMonthLate
## Min. : 6.00 Min. :0
## 1st Qu.:10.00 1st Qu.:0
## Median :11.00 Median :0
## Mean :13.67 Mean :0
## 3rd Qu.:13.00 3rd Qu.:0
## Max. :33.00 Max. :0
## NA's :1 NA's :1
## ProsperPaymentsOneMonthPlusLate ProsperPrincipalBorrowed
## Min. :0 Min. : 1000
## 1st Qu.:0 1st Qu.: 3000
## Median :0 Median : 5102
## Mean :0 Mean : 6178
## 3rd Qu.:0 3rd Qu.: 9000
## Max. :0 Max. :15500
## NA's :1 NA's :1
## ProsperPrincipalOutstanding ScorexChangeAtTimeOfListing
## Min. : 0.0 Min. :-57.000
## 1st Qu.: 0.0 1st Qu.:-22.000
## Median : 481.6 Median : -2.000
## Mean :1243.7 Mean : 7.222
## 3rd Qu.:2260.8 3rd Qu.: 21.000
## Max. :3987.8 Max. :107.000
## NA's :1 NA's :1
## LoanCurrentDaysDelinquent LoanFirstDefaultedCycleNumber
## Min. :0 Min. : NA
## 1st Qu.:0 1st Qu.: NA
## Median :0 Median : NA
## Mean :0 Mean :NaN
## 3rd Qu.:0 3rd Qu.: NA
## Max. :0 Max. : NA
## NA's :10
## LoanMonthsSinceOrigination LoanNumber LoanOriginalAmount
## Min. :58 Min. :38031 Min. :1000
## 1st Qu.:58 1st Qu.:38033 1st Qu.:1125
## Median :58 Median :38036 Median :1850
## Mean :58 Mean :38036 Mean :2270
## 3rd Qu.:58 3rd Qu.:38038 3rd Qu.:3000
## Max. :58 Max. :38044 Max. :5000
##
## LoanOriginationDate LoanOriginationQuarter MemberKey
## Min. :2009-05-06 Q2 2009:10 01873419418028013A65F9B:1
## 1st Qu.:2009-05-07 Q4 2005: 0 0588342364795854665007E:1
## Median :2009-05-12 Q1 2006: 0 08B9341462500905990325D:1
## Mean :2009-05-10 Q2 2006: 0 1177340984660368892073C:1
## 3rd Qu.:2009-05-13 Q3 2006: 0 43D93390371566774874F59:1
## Max. :2009-05-14 Q4 2006: 0 63CA34120866140639431C9:1
## (Other): 0 (Other) :4
## MonthlyLoanPayment LP_CustomerPayments LP_CustomerPrincipalPayments
## Min. : 35.40 Min. :0 Min. :0
## 1st Qu.: 46.57 1st Qu.:0 1st Qu.:0
## Median : 60.97 Median :0 Median :0
## Mean : 81.66 Mean :0 Mean :0
## 3rd Qu.:108.95 3rd Qu.:0 3rd Qu.:0
## Max. :158.33 Max. :0 Max. :0
##
## LP_InterestandFees LP_ServiceFees LP_CollectionFees LP_GrossPrincipalLoss
## Min. :0 Min. :0 Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0 1st Qu.:0
## Median :0 Median :0 Median :0 Median :0
## Mean :0 Mean :0 Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0 Max. :0 Max. :0
##
## LP_NetPrincipalLoss LP_NonPrincipalRecoverypayments PercentFunded
## Min. :0 Min. :0 Min. :1
## 1st Qu.:0 1st Qu.:0 1st Qu.:1
## Median :0 Median :0 Median :1
## Mean :0 Mean :0 Mean :1
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:1
## Max. :0 Max. :0 Max. :1
##
## Recommendations InvestmentFromFriendsCount InvestmentFromFriendsAmount
## Min. :0.0 Min. :0 Min. :0
## 1st Qu.:0.0 1st Qu.:0 1st Qu.:0
## Median :0.0 Median :0 Median :0
## Mean :0.2 Mean :0 Mean :0
## 3rd Qu.:0.0 3rd Qu.:0 3rd Qu.:0
## Max. :1.0 Max. :0 Max. :0
##
## Investors Rating
## Min. :15.00 C :3
## 1st Qu.:21.75 B :3
## Median :42.00 D :2
## Mean :39.90 A :1
## 3rd Qu.:48.50 AA :1
## Max. :79.00 NC :0
## (Other):0
# all completed loans with no customer payments
summary(filter(data, LoanStatus == "Completed" &
                 LP_CustomerPayments == 0))
## ListingKey ListingNumber ListingCreationDate
## 0D113451667173664D2D2EB:1 Min. :415054 Min. :2009-04-28
## 21F63451223082614E3D321:1 1st Qu.:415118 1st Qu.:2009-04-28
## 27D034509373504094A753E:1 Median :415327 Median :2009-04-29
## 36C63450215037018088662:1 Mean :415293 Mean :2009-04-29
## 44773450501513236DCFCEA:1 3rd Qu.:415412 3rd Qu.:2009-04-30
## 532034516025331845B3905:1 Max. :415577 Max. :2009-05-02
## (Other) :4
## CreditGrade Term LoanStatus
## C :3 Min. :36 Completed :10
## B :3 1st Qu.:36 Cancelled : 0
## D :2 Median :36 Chargedoff : 0
## A :1 Mean :36 Current : 0
## AA :1 3rd Qu.:36 Defaulted : 0
## NC :0 Max. :36 FinalPaymentInProgress: 0
## (Other):0 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2009-06-15 Min. :0.09677 Min. :0.0760 Min. :0.0660
## 1st Qu.:2010-11-26 1st Qu.:0.12887 1st Qu.:0.1018 1st Qu.:0.0918
## Median :2011-08-19 Median :0.20999 Median :0.1632 Median :0.1532
## Mean :2011-05-26 Mean :0.22253 Mean :0.1863 Mean :0.1763
## 3rd Qu.:2012-02-24 3rd Qu.:0.27000 3rd Qu.:0.2335 3rd Qu.:0.2235
## Max. :2012-05-13 Max. :0.39951 Max. :0.3500 Max. :0.3400
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn ProsperRating.num
## Min. : NA Min. : NA Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA Median : NA Median : NA
## Mean :NaN Mean :NaN Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA Max. : NA Max. : NA
## NA's :10 NA's :10 NA's :10 NA's :10
## ProsperRating.alpha ProsperScore ListingCategory.num BorrowerState
## NC : 0 Min. : NA 1 :4 GA :3
## HR : 0 1st Qu.: NA 5 :2 IL :1
## E : 0 Median : NA 7 :2 MD :1
## D : 0 Mean :NaN 3 :1 MN :1
## C : 0 3rd Qu.: NA 6 :1 NJ :1
## (Other): 0 Max. : NA 0 :0 OH :1
## NA's :10 NA's :10 (Other):0 (Other):2
## Occupation EmploymentStatus
## Clerical :1 Full-time :9
## Computer Programmer :1 Retired :1
## Engineer - Electrical :1 Employed :0
## Food Service Management:1 Not available:0
## Military Enlisted :1 Not employed :0
## Other :1 Other :0
## (Other) :4 (Other) :0
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 7.00 Mode :logical Mode :logical
## 1st Qu.: 29.00 FALSE:5 FALSE:9
## Median : 65.00 TRUE :5 TRUE :1
## Mean : 74.60
## 3rd Qu.: 91.75
## Max. :201.00
##
## GroupKey DateCreditPulled CreditScoreRangeLower
## FEF83377364176536637E50:1 Min. :2009-04-28 Min. :620
## 00343376901312423168731:0 1st Qu.:2009-04-28 1st Qu.:640
## 00943382969547936B0C529:0 Median :2009-04-28 Median :670
## 00AE3392027644405556335:0 Mean :2009-04-28 Mean :682
## 016833805323396548B2370:0 3rd Qu.:2009-04-28 3rd Qu.:700
## (Other) :0 Max. :2009-04-30 Max. :820
## NA's :9
## CreditScoreRangeUpper FirstRecordedCreditLine CurrentCreditLines
## Min. :639 Min. :1976-02-25 Min. : 3.00
## 1st Qu.:659 1st Qu.:1989-04-08 1st Qu.: 6.25
## Median :689 Median :1994-01-04 Median :10.00
## Mean :701 Mean :1994-02-22 Mean :10.80
## 3rd Qu.:719 3rd Qu.:2001-10-18 3rd Qu.:14.25
## Max. :839 Max. :2007-05-17 Max. :21.00
##
## OpenCreditLines TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : 3.00 Min. : 7.0 Min. :2.00
## 1st Qu.: 6.25 1st Qu.:15.0 1st Qu.:4.25
## Median : 8.50 Median :21.5 Median :6.50
## Mean : 8.70 Mean :30.4 Mean :6.00
## 3rd Qu.:11.25 3rd Qu.:39.0 3rd Qu.:7.75
## Max. :14.00 Max. :78.0 Max. :9.00
##
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. : 15.0 Min. :0.0 Min. : 1.00
## 1st Qu.: 98.5 1st Qu.:0.0 1st Qu.: 3.25
## Median :171.0 Median :0.0 Median : 5.00
## Mean :223.3 Mean :1.3 Mean : 6.70
## 3rd Qu.:269.2 3rd Qu.:1.0 3rd Qu.: 7.25
## Max. :840.0 Max. :8.0 Max. :23.00
##
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. :0 Min. :0 Min. :0.0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0.0
## Median :0 Median :0 Median :0.0
## Mean :0 Mean :0 Mean :0.9
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0.0
## Max. :0 Max. :0 Max. :5.0
##
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. :0.0 Min. :0 Min. : 498
## 1st Qu.:0.0 1st Qu.:0 1st Qu.: 1867
## Median :0.0 Median :0 Median : 4888
## Mean :0.1 Mean :0 Mean :10410
## 3rd Qu.:0.0 3rd Qu.:0 3rd Qu.:12075
## Max. :1.0 Max. :0 Max. :35522
##
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. :0.0500 Min. : 129 Min. : 3.0
## 1st Qu.:0.3000 1st Qu.: 607 1st Qu.:13.0
## Median :0.4250 Median : 4952 Median :18.5
## Mean :0.5240 Mean :10860 Mean :25.7
## 3rd Qu.:0.8475 3rd Qu.:19643 3rd Qu.:37.0
## Max. :0.9900 Max. :33730 Max. :68.0
##
## TradesNeverDelinquent.per TradesOpenedLast6Months DebtToIncomeRatio
## Min. :0.420 Min. :0.0 Min. :0.1000
## 1st Qu.:0.890 1st Qu.:0.0 1st Qu.:0.1225
## Median :0.960 Median :1.0 Median :0.1650
## Mean :0.896 Mean :1.4 Mean :0.2700
## 3rd Qu.:1.000 3rd Qu.:2.0 3rd Qu.:0.2175
## Max. :1.000 Max. :6.0 Max. :1.2700
##
## IncomeRange IncomeVerifiable StatedMonthlyIncome
## $25,000-49,999:5 Mode:logical Min. : 66.67
## $50,000-74,999:2 TRUE:10 1st Qu.: 3147.50
## $100,000+ :2 Median : 3817.25
## $1-24,999 :1 Mean : 4753.66
## Not displayed :0 3rd Qu.: 5750.00
## Not employed :0 Max. :10885.42
## (Other) :0
## LoanKey TotalProsperLoans TotalProsperPaymentsBilled
## 04B13555695010627F64371:1 Min. :1.000 Min. : 6.00
## 07D135567368272216AB044:1 1st Qu.:1.000 1st Qu.:10.00
## 158335561462412446E1A4D:1 Median :1.000 Median :11.00
## 24973555335786563CA1C8D:1 Mean :1.444 Mean :13.67
## 98F5355630240037783B9E6:1 3rd Qu.:2.000 3rd Qu.:13.00
## 9B4B355449621339172BFA4:1 Max. :3.000 Max. :33.00
## (Other) :4 NA's :1 NA's :1
## OnTimeProsperPayments ProsperPaymentsLessThanOneMonthLate
## Min. : 6.00 Min. :0
## 1st Qu.:10.00 1st Qu.:0
## Median :11.00 Median :0
## Mean :13.67 Mean :0
## 3rd Qu.:13.00 3rd Qu.:0
## Max. :33.00 Max. :0
## NA's :1 NA's :1
## ProsperPaymentsOneMonthPlusLate ProsperPrincipalBorrowed
## Min. :0 Min. : 1000
## 1st Qu.:0 1st Qu.: 3000
## Median :0 Median : 5102
## Mean :0 Mean : 6178
## 3rd Qu.:0 3rd Qu.: 9000
## Max. :0 Max. :15500
## NA's :1 NA's :1
## ProsperPrincipalOutstanding ScorexChangeAtTimeOfListing
## Min. : 0.0 Min. :-57.000
## 1st Qu.: 0.0 1st Qu.:-22.000
## Median : 481.6 Median : -2.000
## Mean :1243.7 Mean : 7.222
## 3rd Qu.:2260.8 3rd Qu.: 21.000
## Max. :3987.8 Max. :107.000
## NA's :1 NA's :1
## LoanCurrentDaysDelinquent LoanFirstDefaultedCycleNumber
## Min. :0 Min. : NA
## 1st Qu.:0 1st Qu.: NA
## Median :0 Median : NA
## Mean :0 Mean :NaN
## 3rd Qu.:0 3rd Qu.: NA
## Max. :0 Max. : NA
## NA's :10
## LoanMonthsSinceOrigination LoanNumber LoanOriginalAmount
## Min. :58 Min. :38031 Min. :1000
## 1st Qu.:58 1st Qu.:38033 1st Qu.:1125
## Median :58 Median :38036 Median :1850
## Mean :58 Mean :38036 Mean :2270
## 3rd Qu.:58 3rd Qu.:38038 3rd Qu.:3000
## Max. :58 Max. :38044 Max. :5000
##
## LoanOriginationDate LoanOriginationQuarter MemberKey
## Min. :2009-05-06 Q2 2009:10 01873419418028013A65F9B:1
## 1st Qu.:2009-05-07 Q4 2005: 0 0588342364795854665007E:1
## Median :2009-05-12 Q1 2006: 0 08B9341462500905990325D:1
## Mean :2009-05-10 Q2 2006: 0 1177340984660368892073C:1
## 3rd Qu.:2009-05-13 Q3 2006: 0 43D93390371566774874F59:1
## Max. :2009-05-14 Q4 2006: 0 63CA34120866140639431C9:1
## (Other): 0 (Other) :4
## MonthlyLoanPayment LP_CustomerPayments LP_CustomerPrincipalPayments
## Min. : 35.40 Min. :0 Min. :0
## 1st Qu.: 46.57 1st Qu.:0 1st Qu.:0
## Median : 60.97 Median :0 Median :0
## Mean : 81.66 Mean :0 Mean :0
## 3rd Qu.:108.95 3rd Qu.:0 3rd Qu.:0
## Max. :158.33 Max. :0 Max. :0
##
## LP_InterestandFees LP_ServiceFees LP_CollectionFees LP_GrossPrincipalLoss
## Min. :0 Min. :0 Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0 1st Qu.:0
## Median :0 Median :0 Median :0 Median :0
## Mean :0 Mean :0 Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0 Max. :0 Max. :0
##
## LP_NetPrincipalLoss LP_NonPrincipalRecoverypayments PercentFunded
## Min. :0 Min. :0 Min. :1
## 1st Qu.:0 1st Qu.:0 1st Qu.:1
## Median :0 Median :0 Median :1
## Mean :0 Mean :0 Mean :1
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:1
## Max. :0 Max. :0 Max. :1
##
## Recommendations InvestmentFromFriendsCount InvestmentFromFriendsAmount
## Min. :0.0 Min. :0 Min. :0
## 1st Qu.:0.0 1st Qu.:0 1st Qu.:0
## Median :0.0 Median :0 Median :0
## Mean :0.2 Mean :0 Mean :0
## 3rd Qu.:0.0 3rd Qu.:0 3rd Qu.:0
## Max. :1.0 Max. :0 Max. :0
##
## Investors Rating
## Min. :15.00 C :3
## 1st Qu.:21.75 B :3
## Median :42.00 D :2
## Mean :39.90 A :1
## 3rd Qu.:48.50 AA :1
## Max. :79.00 NC :0
## (Other):0
The main thing I notice is that the majority of loans without customer payments originated in 2014. Most, but not all, are still open, and for most of them the other LP values are also 0. Of the 481 that are closed, most were charged off (267), defaulted (199), or cancelled (5); only 10 were completed. Those 10 completed loans all come from the same loan origination quarter (Q2 2009), have the same number of months since origination (58), and have loan numbers close to one another (38031 to 38044). All of their LP values are 0.
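The clustering of the completed zero-payment loans can be checked directly rather than read off a long summary. Below is a minimal sketch in base R. The toy data frame `toy` and the helper `describe_cluster` are hypothetical and exist only for illustration; only the column names are taken from the data set.

```r
# Hypothetical toy data standing in for the Prosper data; only the column
# names match the real data set.
toy <- data.frame(
  LoanStatus = c("Completed", "Completed", "Completed", "Chargedoff", "Completed"),
  LP_CustomerPayments = c(0, 0, 0, 0, 1200),
  LoanOriginationQuarter = c("Q2 2009", "Q2 2009", "Q2 2009", "Q1 2007", "Q3 2012"),
  LoanMonthsSinceOrigination = c(58, 58, 58, 84, 18),
  LoanNumber = c(38031, 38033, 38036, 8496, 70000),
  stringsAsFactors = FALSE
)

# Summarise the completed loans that have no recorded customer payments.
# which() drops rows where the condition evaluates to NA.
describe_cluster <- function(df) {
  keep <- which(df$LoanStatus == "Completed" & df$LP_CustomerPayments == 0)
  hit <- df[keep, ]
  list(
    n = nrow(hit),
    quarters = unique(as.character(hit$LoanOriginationQuarter)),
    months = unique(hit$LoanMonthsSinceOrigination),
    loan_number_range = range(hit$LoanNumber)
  )
}

cluster <- describe_cluster(toy)
print(cluster)
```

On the real data the same helper could be called as `describe_cluster(data)`; a single quarter, a single months value, and a narrow loan number range would support the reading that these records belong together.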
If the loan has been completed, then with the exception of those 10 records, all customers have made payments.
This leads me to strongly suspect that for this group of 10 loans the data is simply missing, perhaps through a system error. I therefore tentatively conclude that, for this measure, the value is 0 only for borrowers whose loans are still open, or whose loans were charged off, defaulted, or cancelled.
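One compact way to see how zero payments are distributed across loan statuses is a cross-tabulation. The sketch below uses base R's `table()`; the toy data frame `toy` and the helper `zero_payment_table` are hypothetical, introduced only to illustrate the idea.

```r
# Hypothetical toy data; only the column names come from the data set.
toy <- data.frame(
  LoanStatus = c("Completed", "Completed", "Chargedoff",
                 "Defaulted", "Current", "Completed"),
  LP_CustomerPayments = c(0, 1500, 0, 0, 0, 320)
)

# Cross-tabulate loan status against whether no customer payment is recorded.
zero_payment_table <- function(df) {
  table(
    LoanStatus = df$LoanStatus,
    NoPayments = df$LP_CustomerPayments == 0
  )
}

tab <- zero_payment_table(toy)
print(tab)
```

Applied to the full data as `zero_payment_table(data)`, the `TRUE` column would show at a glance which statuses contain zero-payment loans, which is the pattern the conclusion above relies on.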
LP_CollectionFees, LP_NetPrincipalLoss, LP_NonPrincipalRecoverypayments
# loans with collection fees
summary(filter(data, LP_CollectionFees!=0))
## ListingKey ListingNumber ListingCreationDate
## 0CDD3589734051739A10B58: 2 Min. : 28 Min. :2005-11-21
## 426B3588416323222A031B6: 2 1st Qu.: 188371 1st Qu.:2007-08-16
## 87EE35921635083297DEB55: 2 Median : 404988 Median :2008-09-26
## 0005353671687550573289D: 1 Mean : 393687 Mean :2009-10-30
## 000D348547019249114C31E: 1 3rd Qu.: 564927 3rd Qu.:2012-03-03
## 001035373445372274F74E2: 1 Max. :1117488 Max. :2014-01-09
## (Other) :8157
## CreditGrade Term LoanStatus
## D : 836 Min. :12.00 Chargedoff :3510
## C : 802 1st Qu.:36.00 Defaulted :1381
## HR : 721 Median :36.00 Completed :1339
## E : 636 Mean :38.32 Current :1034
## B : 577 3rd Qu.:36.00 Past Due (1-15 days) : 379
## (Other): 546 Max. :60.00 Past Due (31-60 days): 149
## NA's :4048 (Other) : 374
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2006-09-22 Min. :0.01315 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2009-04-04 1st Qu.:0.19212 1st Qu.:0.1750 1st Qu.: 0.1650
## Median :2010-05-29 Median :0.25757 Median :0.2375 Median : 0.2250
## Mean :2010-10-20 Mean :0.25496 Mean :0.2317 Mean : 0.2211
## 3rd Qu.:2012-09-11 3rd Qu.:0.31375 3rd Qu.:0.2900 3rd Qu.: 0.2809
## Max. :2014-03-10 Max. :0.45857 Max. :0.4500 Max. : 0.4325
## NA's :1936 NA's :1
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.126 Min. :0.006 Min. :-0.126
## 1st Qu.: 0.164 1st Qu.:0.072 1st Qu.: 0.105
## Median : 0.231 Median :0.108 Median : 0.124
## Mean : 0.215 Mean :0.110 Mean : 0.120
## 3rd Qu.: 0.280 3rd Qu.:0.147 3rd Qu.: 0.141
## Max. : 0.320 Max. :0.366 Max. : 0.284
## NA's :4118 NA's :4118 NA's :4118
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 D :1040 Min. : 1.000
## 1st Qu.:2.000 E : 806 1st Qu.: 4.000
## Median :3.000 HR : 791 Median : 5.000
## Mean :3.031 C : 665 Mean : 5.342
## 3rd Qu.:4.000 B : 439 3rd Qu.: 7.000
## Max. :7.000 (Other): 307 Max. :11.000
## NA's :4118 NA's :4118 NA's :4118
## ListingCategory.num BorrowerState Occupation
## 1 :2788 CA :1008 Other :2176
## 0 :2485 IL : 467 Professional : 926
## 7 : 773 GA : 453 Administrative Assistant: 357
## 3 : 606 NY : 443 Teacher : 339
## 2 : 475 FL : 429 Sales - Commission : 317
## 4 : 321 (Other):4634 (Other) :3757
## (Other): 718 NA's : 732 NA's : 294
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Full-time :3200 Min. : 0.00 Mode :logical
## Employed :2937 1st Qu.: 20.00 FALSE:4519
## Not available: 753 Median : 53.00 TRUE :3647
## Self-employed: 441 Mean : 84.12
## Other : 185 3rd Qu.:118.00
## (Other) : 365 Max. :732.00
## NA's : 285 NA's :1038
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 783C3371218786870A73D20: 197 Min. :2005-11-21
## FALSE:6347 FEF83377364176536637E50: 132 1st Qu.:2007-08-08
## TRUE :1819 3D4D3366260257624AB272D: 116 Median :2008-09-21
## 6A3B336601725506917317E: 94 Mean :2009-10-27
## FE113364863511529673D04: 58 3rd Qu.:2012-03-06
## (Other) :1306 Max. :2014-01-06
## NA's :6263
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1951-07-04
## 1st Qu.:620.0 1st Qu.:639.0 1st Qu.:1989-12-08
## Median :660.0 Median :679.0 Median :1995-06-07
## Mean :651.4 Mean :670.4 Mean :1994-06-13
## 3rd Qu.:700.0 3rd Qu.:719.0 3rd Qu.:1999-09-30
## Max. :880.0 Max. :899.0 Max. :2011-08-10
## NA's :62 NA's :62 NA's :73
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.000 Min. : 2.00
## 1st Qu.: 5.000 1st Qu.: 4.000 1st Qu.: 15.00
## Median : 8.000 Median : 7.000 Median : 24.00
## Mean : 8.959 Mean : 7.741 Mean : 25.74
## 3rd Qu.:12.000 3rd Qu.:10.000 3rd Qu.: 34.00
## Max. :52.000 Max. :48.000 Max. :108.00
## NA's :1038 NA's :1038 NA's :73
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 2.000 1st Qu.: 45.0 1st Qu.: 0.000
## Median : 5.000 Median : 151.0 Median : 1.000
## Mean : 5.416 Mean : 300.5 Mean : 2.416
## 3rd Qu.: 8.000 3rd Qu.: 378.0 3rd Qu.: 3.000
## Max. :40.000 Max. :14985.0 Max. :105.000
## NA's :73
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.000 Min. : 0.00
## 1st Qu.: 3.000 1st Qu.: 0.000 1st Qu.: 0.00
## Median : 5.000 Median : 0.000 Median : 0.00
## Mean : 8.075 Mean : 1.226 Mean : 1353.47
## 3rd Qu.: 10.000 3rd Qu.: 1.000 3rd Qu.: 66.25
## Max. :379.000 Max. :64.000 Max. :215315.00
## NA's :131 NA's :73 NA's :1038
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 6.469 Mean : 0.4211
## 3rd Qu.: 8.000 3rd Qu.: 1.0000
## Max. :99.000 Max. :17.0000
## NA's :118 NA's :73
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.0000 Min. : 0 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.: 1072 1st Qu.:0.2500
## Median :0.0000 Median : 4540 Median :0.6400
## Mean :0.0363 Mean : 13469 Mean :0.5708
## 3rd Qu.:0.0000 3rd Qu.: 13615 3rd Qu.:0.8900
## Max. :7.0000 Max. :493300 Max. :5.8300
## NA's :1038 NA's :1038 NA's :1038
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 1.00 Min. :0.0000
## 1st Qu.: 192 1st Qu.: 12.00 1st Qu.:0.7000
## Median : 1492 Median : 19.00 Median :0.8500
## Mean : 6211 Mean : 21.46 Mean :0.8083
## 3rd Qu.: 6298 3rd Qu.: 29.00 3rd Qu.:0.9700
## Max. :498374 Max. :102.00 Max. :1.0000
## NA's :1028 NA's :1028 NA's :1028
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.0000 Min. : 0.0000 $25,000-49,999:2536
## 1st Qu.: 0.0000 1st Qu.: 0.1400 $50,000-74,999:1856
## Median : 1.0000 Median : 0.2100 Not displayed :1055
## Mean : 0.9891 Mean : 0.3117 $75,000-99,999: 920
## 3rd Qu.: 2.0000 3rd Qu.: 0.3200 $100,000+ : 824
## Max. :13.0000 Max. :10.0100 $1-24,999 : 793
## NA's :1028 NA's :666 (Other) : 182
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 18FD3697424163510882853: 2
## FALSE:673 1st Qu.: 2667 BD5536972515798992B29C9: 2
## TRUE :7493 Median : 4083 FAF336950776856532E5CFC: 2
## Mean : 5068 00023650503696810C531F7: 1
## 3rd Qu.: 6000 000B3366346245964D6187E: 1
## Max. :1750003 001336793077504887041A4: 1
## (Other) :8157
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :1.0 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.0 1st Qu.: 9.00 1st Qu.: 8.00
## Median :1.0 Median : 14.00 Median : 13.00
## Mean :1.3 Mean : 21.41 Mean : 19.56
## 3rd Qu.:1.0 3rd Qu.: 32.00 3rd Qu.: 27.00
## Max. :5.0 Max. :123.00 Max. :103.00
## NA's :6629 NA's :6629 NA's :6629
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.000 Min. : 0.00
## 1st Qu.: 0.000 1st Qu.: 0.00
## Median : 0.000 Median : 0.00
## Mean : 1.682 Mean : 0.17
## 3rd Qu.: 1.000 3rd Qu.: 0.00
## Max. :42.000 Max. :21.00
## NA's :6629 NA's :6629
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0.00
## 1st Qu.: 3000 1st Qu.: 0.11
## Median : 5000 Median : 2095.15
## Mean : 6648 Mean : 2971.07
## 3rd Qu.: 8000 3rd Qu.: 4013.76
## Max. :41000 Max. :20946.73
## NA's :6629 NA's :6629
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-180.000 Min. : 0.0
## 1st Qu.: -40.000 1st Qu.: 0.0
## Median : -6.000 Median : 222.5
## Mean : -9.108 Mean : 629.1
## 3rd Qu.: 17.000 3rd Qu.:1280.8
## Max. : 214.000 Max. :2408.0
## NA's :6655
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.00 Min. : 2.00 Min. : 8
## 1st Qu.:11.00 1st Qu.: 24.00 1st Qu.: 18357
## Median :16.00 Median : 65.00 Median : 37433
## Mean :17.83 Mean : 52.17 Mean : 41578
## 3rd Qu.:25.00 3rd Qu.: 79.00 3rd Qu.: 61954
## Max. :41.00 Max. :100.00 Max. :126049
## NA's :3237
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2005-11-28 Q2 2008: 600
## 1st Qu.: 3000 1st Qu.:2007-08-27 Q2 2007: 530
## Median : 4760 Median :2008-10-06 Q2 2012: 510
## Mean : 6695 Mean :2009-11-11 Q3 2008: 493
## 3rd Qu.: 9000 3rd Qu.:2012-03-14 Q3 2012: 493
## Max. :35000 Max. :2014-01-13 Q1 2007: 450
## (Other):5090
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 0196338772112490878A367: 3 Min. : 0.0 Min. : -2.35
## 13F63388456517019BEEF82: 3 1st Qu.: 108.6 1st Qu.: 1450.01
## 2AFA35685188842990454C4: 3 Median : 173.7 Median : 3000.88
## 47DB3372200088144B54373: 3 Mean : 233.4 Mean : 4427.73
## 63D833653414495348BC9AA: 3 3rd Qu.: 314.2 3rd Qu.: 5741.77
## 72DC3382168310312DC5EE1: 3 Max. :1340.0 Max. :40547.70
## (Other) :8148
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : -2.35 Min. :-664.87
## 1st Qu.: 661.7 1st Qu.: 635.39 1st Qu.: -83.36
## Median : 1589.1 Median : 1227.30 Median : -42.99
## Mean : 2797.0 Mean : 1630.69 Mean : -63.71
## 3rd Qu.: 3526.4 3rd Qu.: 2111.66 3rd Qu.: -20.33
## Max. :25087.7 Max. :15547.70 Max. : 32.06
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9274.75 Min. : -94.2 Min. : -954.5
## 1st Qu.: -204.56 1st Qu.: 0.0 1st Qu.: 0.0
## Median : -83.61 Median : 1124.6 Median : 728.7
## Mean : -198.72 Mean : 2633.5 Mean : 2395.2
## 3rd Qu.: -36.06 3rd Qu.: 3677.9 3rd Qu.: 3304.8
## Max. : -0.40 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.0 Min. :0.7013 Min. : 0.00000
## 1st Qu.: 0.0 1st Qu.:1.0000 1st Qu.: 0.00000
## Median : 0.0 Median :1.0000 Median : 0.00000
## Mean : 281.7 Mean :0.9982 Mean : 0.07654
## 3rd Qu.: 130.3 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :21117.9 Max. :1.0000 Max. :16.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.0 Min. : 1.00
## 1st Qu.:0.00000 1st Qu.: 0.0 1st Qu.: 25.00
## Median :0.00000 Median : 0.0 Median : 57.00
## Mean :0.03355 Mean : 33.3 Mean : 91.74
## 3rd Qu.:0.00000 3rd Qu.: 0.0 3rd Qu.:121.00
## Max. :9.00000 Max. :15000.0 Max. :833.00
##
## Rating
## D :1876
## HR :1512
## C :1467
## E :1442
## B :1016
## A : 581
## (Other): 272
# loans with a nonzero net principal loss (a few values are negative)
summary(filter(data, LP_NetPrincipalLoss != 0))
## ListingKey ListingNumber ListingCreationDate
## 00003546482094282EF90E5: 1 Min. : 99 Min. :2006-01-25
## 00013542762124763F20254: 1 1st Qu.:131376 1st Qu.:2007-05-01
## 000433785890431972B4743: 1 Median :319181 Median :2008-04-27
## 0005353671687550573289D: 1 Mean :324530 Mean :2009-02-18
## 001035373445372274F74E2: 1 3rd Qu.:511888 3rd Qu.:2011-06-16
## 00143395229257559A91663: 1 Max. :932346 Max. :2013-09-26
## (Other) :16709
## CreditGrade Term LoanStatus
## HR :2110 Min. :12.00 Chargedoff :11982
## C :2009 1st Qu.:36.00 Defaulted : 4733
## D :2003 Median :36.00 Cancelled : 0
## E :1591 Mean :37.69 Completed : 0
## B :1384 3rd Qu.:36.00 Current : 0
## (Other):1426 Max. :60.00 FinalPaymentInProgress: 0
## NA's :6192 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2006-09-05 Min. :0.00864 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2008-11-13 1st Qu.:0.18698 1st Qu.:0.1703 1st Qu.: 0.1610
## Median :2009-12-19 Median :0.25424 Median :0.2375 Median : 0.2250
## Mean :2010-07-23 Mean :0.25206 Mean :0.2317 Mean : 0.2209
## 3rd Qu.:2012-10-01 3rd Qu.:0.30781 3rd Qu.:0.2900 3rd Qu.: 0.2809
## Max. :2014-03-10 Max. :0.50633 Max. :0.4975 Max. : 0.4800
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.182 Min. :0.006 Min. :-0.182
## 1st Qu.: 0.159 1st Qu.:0.087 1st Qu.: 0.111
## Median : 0.235 Median :0.112 Median : 0.125
## Mean : 0.217 Mean :0.116 Mean : 0.123
## 3rd Qu.: 0.286 3rd Qu.:0.149 3rd Qu.: 0.144
## Max. : 0.320 Max. :0.366 Max. : 0.284
## NA's :10532 NA's :10532 NA's :10532
## ProsperRating.num ProsperRating.alpha ProsperScore ListingCategory.num
## Min. :1.000 D : 1633 Min. : 1.00 0 :6635
## 1st Qu.:2.000 HR : 1384 1st Qu.: 4.00 1 :4588
## Median :3.000 E : 1294 Median : 5.00 7 :1450
## Mean :2.906 C : 816 Mean : 5.42 3 :1387
## 3rd Qu.:4.000 B : 581 3rd Qu.: 7.00 2 : 792
## Max. :7.000 (Other): 475 Max. :10.00 4 : 760
## NA's :10532 NA's :10532 NA's :10532 (Other):1103
## BorrowerState Occupation EmploymentStatus
## CA :2269 Other :4670 Full-time :7416
## GA :1017 Professional :1671 Employed :4065
## IL :1012 Clerical : 699 Not available:2252
## FL : 969 Sales - Commission : 680 Self-employed:1131
## TX : 911 Administrative Assistant: 659 Other : 342
## (Other):8925 (Other) :7529 (Other) : 702
## NA's :1612 NA's : 807 NA's : 807
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 0.00 Mode :logical Mode :logical
## 1st Qu.: 19.00 FALSE:9236 FALSE:12103
## Median : 51.00 TRUE :7479 TRUE :4612
## Mean : 80.11
## 3rd Qu.:112.00
## Max. :755.00
## NA's :3062
## GroupKey DateCreditPulled
## 783C3371218786870A73D20: 493 Min. :2005-12-11
## FEF83377364176536637E50: 310 1st Qu.:2007-04-24
## 3D4D3366260257624AB272D: 293 Median :2008-04-22
## 6A3B336601725506917317E: 275 Mean :2009-02-14
## FE113364863511529673D04: 180 3rd Qu.:2011-06-16
## (Other) : 3302 Max. :2013-09-26
## NA's :11862
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1947-08-24
## 1st Qu.:600.0 1st Qu.:619.0 1st Qu.:1990-07-12
## Median :640.0 Median :659.0 Median :1995-09-01
## Mean :640.5 Mean :659.5 Mean :1994-09-26
## 3rd Qu.:700.0 3rd Qu.:719.0 3rd Qu.:1999-11-27
## Max. :860.0 Max. :879.0 Max. :2011-08-10
## NA's :173 NA's :173 NA's :232
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.000 Min. : 2.00
## 1st Qu.: 5.000 1st Qu.: 4.000 1st Qu.: 14.00
## Median : 8.000 Median : 7.000 Median : 23.00
## Mean : 9.274 Mean : 8.073 Mean : 25.03
## 3rd Qu.:13.000 3rd Qu.:11.000 3rd Qu.: 34.00
## Max. :52.000 Max. :51.000 Max. :129.00
## NA's :3060 NA's :3060 NA's :232
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 2.000 1st Qu.: 37.0 1st Qu.: 0.000
## Median : 5.000 Median : 152.0 Median : 2.000
## Mean : 5.595 Mean : 319.2 Mean : 3.008
## 3rd Qu.: 8.000 3rd Qu.: 400.0 3rd Qu.: 4.000
## Max. :51.000 Max. :14985.0 Max. :105.000
## NA's :232
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.000 Min. : 0
## 1st Qu.: 3.000 1st Qu.: 0.000 1st Qu.: 0
## Median : 6.000 Median : 0.000 Median : 0
## Mean : 9.583 Mean : 1.615 Mean : 1422
## 3rd Qu.: 13.000 3rd Qu.: 1.000 3rd Qu.: 106
## Max. :379.000 Max. :83.000 Max. :444745
## NA's :365 NA's :232 NA's :3062
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 6.047 Mean : 0.4478
## 3rd Qu.: 7.000 3rd Qu.: 1.0000
## Max. :99.000 Max. :30.0000
## NA's :341 NA's :232
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.0000 Min. : 0 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.: 1086 1st Qu.:0.2300
## Median :0.0000 Median : 5137 Median :0.6300
## Mean :0.0359 Mean : 15584 Mean :0.5659
## 3rd Qu.:0.0000 3rd Qu.: 15578 3rd Qu.:0.8900
## Max. :7.0000 Max. :600223 Max. :4.7300
## NA's :3060 NA's :3060 NA's :3060
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 1.00 Min. :0.000
## 1st Qu.: 200 1st Qu.: 12.00 1st Qu.:0.710
## Median : 1778 Median : 19.00 Median :0.880
## Mean : 7162 Mean : 21.18 Mean :0.819
## 3rd Qu.: 7678 3rd Qu.: 29.00 3rd Qu.:1.000
## Max. :364284 Max. :118.00 Max. :1.000
## NA's :3043 NA's :3043 NA's :3043
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.0000 $25,000-49,999:5336
## 1st Qu.: 0.000 1st Qu.: 0.1381 $50,000-74,999:3442
## Median : 1.000 Median : 0.2200 Not displayed :3103
## Mean : 1.124 Mean : 0.3483 $1-24,999 :1634
## 3rd Qu.: 2.000 3rd Qu.: 0.3300 $75,000-99,999:1494
## Max. :17.000 Max. :10.0100 $100,000+ :1269
## NA's :3043 NA's :1474 (Other) : 437
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 00023650503696810C531F7: 1
## FALSE:1495 1st Qu.: 2500 0004363753221955965B646: 1
## TRUE :15220 Median : 3750 000836579711360490B130B: 1
## Mean : 4452 000B3366346245964D6187E: 1
## 3rd Qu.: 5417 000B3656359179267F91999: 1
## Max. :208333 00193564075967640E1A9A1: 1
## (Other) :16709
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :1.000 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.000 1st Qu.: 8.00 1st Qu.: 8.00
## Median :1.000 Median : 12.00 Median : 12.00
## Mean :1.254 Mean : 18.59 Mean : 17.77
## 3rd Qu.:1.000 3rd Qu.: 25.00 3rd Qu.: 23.00
## Max. :7.000 Max. :103.00 Max. :101.00
## NA's :14324 NA's :14324 NA's :14324
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.000 Min. :0.000
## 1st Qu.: 0.000 1st Qu.:0.000
## Median : 0.000 Median :0.000
## Mean : 0.753 Mean :0.061
## 3rd Qu.: 0.000 3rd Qu.:0.000
## Max. :26.000 Max. :8.000
## NA's :14324 NA's :14324
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0.00
## 1st Qu.: 3000 1st Qu.: 0.17
## Median : 5000 Median : 2045.10
## Mean : 6682 Mean : 3055.53
## 3rd Qu.: 8500 3rd Qu.: 4179.37
## Max. :53200 Max. :22586.67
## NA's :14324 NA's :14324
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-194.00 Min. : 16
## 1st Qu.: -40.00 1st Qu.: 292
## Median : -6.00 Median : 805
## Mean : -10.05 Mean :1032
## 3rd Qu.: 19.00 3rd Qu.:1781
## Max. : 214.00 Max. :2704
## NA's :14325
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.00 Min. : 5.0 Min. : 29
## 1st Qu.: 9.00 1st Qu.:33.0 1st Qu.: 14850
## Median :14.00 Median :70.0 Median : 30773
## Mean :16.22 Mean :60.5 Mean : 33277
## 3rd Qu.:22.00 3rd Qu.:82.0 3rd Qu.: 50746
## Max. :44.00 Max. :98.0 Max. :103467
## NA's :13
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2006-01-27 Q2 2008:1444
## 1st Qu.: 2999 1st Qu.:2007-05-11 Q1 2007:1282
## Median : 4500 Median :2008-05-08 Q2 2007:1250
## Mean : 6449 Mean :2009-03-02 Q3 2008:1127
## 3rd Qu.: 8000 3rd Qu.:2011-06-30 Q3 2007:1032
## Max. :25000 Max. :2013-10-01 Q4 2006:1023
## (Other):9557
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 006C3373804016872128132: 2 Min. : 0.0 Min. : -2.35
## 009C35078002646985845CF: 2 1st Qu.: 108.6 1st Qu.: 676.90
## 00C43387968070538859D91: 2 Median : 173.7 Median : 1653.67
## 018B35275926204010E51B6: 2 Mean : 236.6 Mean : 2794.26
## 01D33386346150055C7F757: 2 3rd Qu.: 309.5 3rd Qu.: 3561.80
## 01DA3382241797159B9FE89: 2 Max. :1552.8 Max. :34021.80
## (Other) :16703
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : -2.35 Min. :-664.87
## 1st Qu.: 297.7 1st Qu.: 321.14 1st Qu.: -59.02
## Median : 850.3 Median : 727.10 Median : -26.48
## Mean : 1696.4 Mean : 1097.89 Mean : -44.63
## 3rd Qu.: 2051.3 3rd Qu.: 1445.17 3rd Qu.: -10.54
## Max. :24596.4 Max. :14329.49 Max. : 32.06
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9274.75 Min. : -94.2 Min. : -954.5
## 1st Qu.: -17.00 1st Qu.: 1841.3 1st Qu.: 1750.0
## Median : 0.00 Median : 3336.2 Median : 3247.4
## Mean : -67.23 Mean : 4747.8 Mean : 4644.9
## 3rd Qu.: 0.00 3rd Qu.: 6028.8 3rd Qu.: 5909.2
## Max. : 0.00 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.0 Min. :0.7012 Min. : 0.00000
## 1st Qu.: 0.0 1st Qu.:1.0000 1st Qu.: 0.00000
## Median : 0.0 Median :1.0000 Median : 0.00000
## Mean : 164.1 Mean :0.9982 Mean : 0.08268
## 3rd Qu.: 0.0 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :21117.9 Max. :1.0000 Max. :16.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.00 Min. : 1.00
## 1st Qu.:0.00000 1st Qu.: 0.00 1st Qu.: 27.00
## Median :0.00000 Median : 0.00 Median : 60.00
## Mean :0.03362 Mean : 27.64 Mean : 97.67
## 3rd Qu.:0.00000 3rd Qu.: 0.00 3rd Qu.:129.00
## Max. :9.00000 Max. :12500.00 Max. :881.00
##
## Rating
## D :3636
## HR :3494
## E :2885
## C :2825
## B :1965
## (Other):1901
## NA's : 9
# loans with non-principal recovery payments
summary(filter(data, LP_NonPrincipalRecoverypayments != 0))
## ListingKey ListingNumber ListingCreationDate
## 0005353671687550573289D: 1 Min. : 99 Min. :2006-01-25
## 001035373445372274F74E2: 1 1st Qu.:141414 1st Qu.:2007-05-22
## 00293413955892317967503: 1 Median :295332 Median :2008-03-18
## 00433419411531491904742: 1 Mean :299412 Mean :2008-10-07
## 005E35068034002701D1E8F: 1 3rd Qu.:448293 3rd Qu.:2010-03-01
## 007E35498620125415AF2FF: 1 Max. :813315 Max. :2013-06-18
## (Other) :3255
## CreditGrade Term LoanStatus
## C :492 Min. :12.0 Chargedoff :2048
## D :482 1st Qu.:36.0 Defaulted :1213
## B :354 Median :36.0 Cancelled : 0
## E :351 Mean :36.8 Completed : 0
## HR :336 3rd Qu.:36.0 Current : 0
## (Other):324 Max. :60.0 FinalPaymentInProgress: 0
## NA's :922 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2006-09-22 Min. :0.01315 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2009-01-07 1st Qu.:0.17722 1st Qu.:0.1650 1st Qu.: 0.1530
## Median :2009-12-16 Median :0.24264 Median :0.2300 Median : 0.2169
## Mean :2010-05-21 Mean :0.24646 Mean :0.2275 Mean : 0.2167
## 3rd Qu.:2011-12-02 3rd Qu.:0.30564 3rd Qu.:0.2900 3rd Qu.: 0.2800
## Max. :2014-02-28 Max. :0.41355 Max. :0.3600 Max. : 0.3525
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.0508 Min. :0.0060 Min. :-0.0508
## 1st Qu.: 0.1496 1st Qu.:0.0890 1st Qu.: 0.1124
## Median : 0.2377 Median :0.1120 Median : 0.1304
## Mean : 0.2131 Mean :0.1189 Mean : 0.1260
## 3rd Qu.: 0.2861 3rd Qu.:0.1490 3rd Qu.: 0.1463
## Max. : 0.3199 Max. :0.3660 Max. : 0.2230
## NA's :2339 NA's :2339 NA's :2339
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 D : 265 Min. : 1.000
## 1st Qu.:2.000 E : 210 1st Qu.: 4.000
## Median :3.000 HR : 206 Median : 5.000
## Mean :2.807 C : 110 Mean : 5.397
## 3rd Qu.:4.000 B : 67 3rd Qu.: 7.000
## Max. :7.000 (Other): 64 Max. :10.000
## NA's :2339 NA's :2339 NA's :2339
## ListingCategory.num BorrowerState Occupation
## 0 :1342 CA : 397 Other : 824
## 1 : 899 GA : 211 Professional : 372
## 7 : 284 IL : 204 Teacher : 156
## 3 : 225 FL : 134 Clerical : 141
## 4 : 198 NY : 134 Sales - Commission: 134
## 2 : 136 (Other):1834 (Other) :1521
## (Other): 177 NA's : 347 NA's : 113
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Full-time :1839 Min. : 0.0 Mode :logical
## Employed : 546 1st Qu.: 20.0 FALSE:1711
## Not available: 368 Median : 53.0 TRUE :1550
## Self-employed: 195 Mean : 83.1
## Part-time : 77 3rd Qu.:117.0
## (Other) : 123 Max. :573.0
## NA's : 113 NA's :482
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 783C3371218786870A73D20: 87 Min. :2005-12-11
## FALSE:2296 FEF83377364176536637E50: 78 1st Qu.:2007-05-16
## TRUE :965 3D4D3366260257624AB272D: 43 Median :2008-03-14
## 9BBE337094173775621CD34: 43 Mean :2008-10-03
## FE113364863511529673D04: 34 3rd Qu.:2010-02-22
## (Other) : 703 Max. :2013-06-18
## NA's :2273
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1958-02-01
## 1st Qu.:600.0 1st Qu.:619.0 1st Qu.:1989-04-01
## Median :640.0 Median :659.0 Median :1994-09-21
## Mean :642.7 Mean :661.7 Mean :1993-10-05
## 3rd Qu.:700.0 3rd Qu.:719.0 3rd Qu.:1999-01-07
## Max. :860.0 Max. :879.0 Max. :2010-12-02
## NA's :24 NA's :24 NA's :28
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.000 Min. : 2.00
## 1st Qu.: 5.000 1st Qu.: 4.000 1st Qu.: 15.00
## Median : 8.000 Median : 7.000 Median : 25.00
## Mean : 9.275 Mean : 7.878 Mean : 26.67
## 3rd Qu.:13.000 3rd Qu.:11.000 3rd Qu.: 36.00
## Max. :45.000 Max. :39.000 Max. :120.00
## NA's :481 NA's :481 NA's :28
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 2.000 1st Qu.: 40.0 1st Qu.: 1.000
## Median : 5.000 Median : 144.0 Median : 2.000
## Mean : 5.505 Mean : 319.7 Mean : 2.962
## 3rd Qu.: 8.000 3rd Qu.: 391.0 3rd Qu.: 4.000
## Max. :40.000 Max. :5853.0 Max. :97.000
## NA's :28
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.000 Min. : 0
## 1st Qu.: 3.000 1st Qu.: 0.000 1st Qu.: 0
## Median : 7.000 Median : 0.000 Median : 0
## Mean : 9.803 Mean : 1.393 Mean : 1347
## 3rd Qu.: 13.000 3rd Qu.: 1.000 3rd Qu.: 179
## Max. :379.000 Max. :64.000 Max. :215315
## NA's :52 NA's :28 NA's :481
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 1.000 Median : 0.0000
## Mean : 7.099 Mean : 0.4476
## 3rd Qu.: 9.000 3rd Qu.: 1.0000
## Max. :99.000 Max. :17.0000
## NA's :47 NA's :28
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.0000 Min. : 0.0 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.: 858.8 1st Qu.:0.2375
## Median :0.0000 Median : 4347.5 Median :0.6500
## Mean :0.0432 Mean : 14841.3 Mean :0.5777
## 3rd Qu.:0.0000 3rd Qu.: 14215.0 3rd Qu.:0.9100
## Max. :7.0000 Max. :493300.0 Max. :4.7300
## NA's :481 NA's :481 NA's :481
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0.0 Min. : 1.00 Min. :0.0000
## 1st Qu.: 126.2 1st Qu.: 12.00 1st Qu.:0.6600
## Median : 1132.5 Median : 20.00 Median :0.8200
## Mean : 6107.1 Mean : 22.34 Mean :0.7819
## 3rd Qu.: 6000.0 3rd Qu.: 30.75 3rd Qu.:0.9500
## Max. :364284.0 Max. :118.00 Max. :1.0000
## NA's :475 NA's :475 NA's :475
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.0100 $25,000-49,999:1036
## 1st Qu.: 0.000 1st Qu.: 0.1400 $50,000-74,999: 730
## Median : 1.000 Median : 0.2200 Not displayed : 494
## Mean : 1.107 Mean : 0.3263 $75,000-99,999: 344
## 3rd Qu.: 2.000 3rd Qu.: 0.3400 $1-24,999 : 324
## Max. :13.000 Max. :10.0100 $100,000+ : 259
## NA's :475 NA's :224 (Other) : 74
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 000B3366346245964D6187E: 1
## FALSE:228 1st Qu.: 2625 002E362825862155611E637: 1
## TRUE :3033 Median : 4000 0038364851207507437AE49: 1
## Mean : 4620 005D36460825304664D7BC1: 1
## 3rd Qu.: 5792 00803393723192256BBDBA4: 1
## Max. :208333 008A366194593193999BEE4: 1
## (Other) :3255
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :1.000 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.000 1st Qu.: 8.00 1st Qu.: 7.00
## Median :1.000 Median :12.00 Median :12.00
## Mean :1.177 Mean :17.82 Mean :16.67
## 3rd Qu.:1.000 3rd Qu.:23.00 3rd Qu.:21.00
## Max. :5.000 Max. :90.00 Max. :90.00
## NA's :2799 NA's :2799 NA's :2799
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.000 Min. :0.0000
## 1st Qu.: 0.000 1st Qu.:0.0000
## Median : 0.000 Median :0.0000
## Mean : 1.069 Mean :0.0823
## 3rd Qu.: 1.000 3rd Qu.:0.0000
## Max. :24.000 Max. :7.0000
## NA's :2799 NA's :2799
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0.0
## 1st Qu.: 2550 1st Qu.: 321.6
## Median : 4022 Median : 1982.3
## Mean : 5702 Mean : 2746.8
## 3rd Qu.: 7150 3rd Qu.: 3916.9
## Max. :30000 Max. :20946.7
## NA's :2799 NA's :2799
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-160.0000 Min. : 16
## 1st Qu.: -36.0000 1st Qu.: 368
## Median : 0.0000 Median : 951
## Mean : 0.8351 Mean :1042
## 3rd Qu.: 37.0000 3rd Qu.:1688
## Max. : 214.0000 Max. :2613
## NA's :2800
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.00 Min. : 9.00 Min. : 29
## 1st Qu.: 8.00 1st Qu.:48.00 1st Qu.:15589
## Median :14.00 Median :72.00 Median :29091
## Mean :15.79 Mean :64.91 Mean :29839
## 3rd Qu.:23.00 3rd Qu.:81.00 3rd Qu.:41253
## Max. :41.00 Max. :98.00 Max. :93727
##
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2006-01-27 Q2 2008: 375
## 1st Qu.: 3000 1st Qu.:2007-06-01 Q2 2007: 310
## Median : 5000 Median :2008-03-31 Q3 2008: 303
## Mean : 6575 Mean :2008-10-19 Q1 2008: 255
## 3rd Qu.: 8000 3rd Qu.:2010-03-17 Q3 2007: 240
## Max. :25000 Max. :2013-06-20 Q1 2007: 233
## (Other):1545
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 00C43387968070538859D91: 2 Min. : 0.0 Min. : 0
## 030E3403407292850D85CBC: 2 1st Qu.: 105.1 1st Qu.: 962
## 03863429108114327FA5713: 2 Median : 173.7 Median : 2169
## 03F43394048903402B10A91: 2 Mean : 235.3 Mean : 3456
## 04583480851747351D652F4: 2 3rd Qu.: 309.3 3rd Qu.: 4365
## 049233823972095378B0176: 2 Max. :1130.9 Max. :34022
## (Other) :3249
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : 0.0 Min. :-664.87
## 1st Qu.: 417.9 1st Qu.: 453.3 1st Qu.: -73.69
## Median : 1121.9 Median : 937.6 Median : -35.93
## Mean : 2125.9 Mean : 1330.4 Mean : -56.06
## 3rd Qu.: 2535.7 3rd Qu.: 1713.5 3rd Qu.: -16.02
## Max. :24939.2 Max. :14329.5 Max. : 32.06
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9274.75 Min. : -94.2 Min. : -954.5
## 1st Qu.: -372.00 1st Qu.: 1663.6 1st Qu.: 939.7
## Median : -104.77 Median : 3109.4 Median : 2560.9
## Mean : -300.52 Mean : 4434.9 Mean : 3759.6
## 3rd Qu.: -17.52 3rd Qu.: 5669.3 3rd Qu.: 4850.2
## Max. : 0.00 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.04 Min. :0.7013 Min. : 0.00000
## 1st Qu.: 132.00 1st Qu.:1.0000 1st Qu.: 0.00000
## Median : 387.82 Median :1.0000 Median : 0.00000
## Mean : 878.47 Mean :0.9985 Mean : 0.09874
## 3rd Qu.: 1025.00 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :21117.90 Max. :1.0000 Max. :16.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.00 Min. : 1.0
## 1st Qu.:0.00000 1st Qu.: 0.00 1st Qu.: 32.0
## Median :0.00000 Median : 0.00 Median : 66.0
## Mean :0.04293 Mean : 35.84 Mean :106.6
## 3rd Qu.:0.00000 3rd Qu.: 0.00 3rd Qu.:142.0
## Max. :7.00000 Max. :12500.00 Max. :821.0
##
## Rating
## D :747
## C :602
## E :561
## HR :542
## B :421
## A :248
## (Other):140
# closed loans with no non-principal recovery payments
summary(filter(data, !is.na(ClosedDate) &
                 LP_NonPrincipalRecoverypayments == 0))
## ListingKey ListingNumber ListingCreationDate
## 018A360063948152589C8BE: 2 Min. : 4 Min. :2005-11-09
## 30F435938764424435A1188: 2 1st Qu.: 190746 1st Qu.:2007-08-21
## 32943590099161153292459: 2 Median : 396320 Median :2008-09-10
## 6DFC3591891372387BB41B2: 2 Mean : 373438 Mean :2009-08-01
## 778D35919242972923313E0: 2 3rd Qu.: 528024 3rd Qu.:2011-09-19
## 82FD35914405776692938D4: 2 Max. :1204824 Max. :2014-02-13
## (Other) :51816
## CreditGrade Term LoanStatus
## C : 5157 Min. :12.00 Completed :38074
## D : 4671 1st Qu.:36.00 Chargedoff : 9944
## B : 4035 Median :36.00 Defaulted : 3805
## AA : 3385 Mean :36.95 Cancelled : 5
## HR : 3172 3rd Qu.:36.00 Current : 0
## (Other): 6194 Max. :60.00 FinalPaymentInProgress: 0
## NA's :25214 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2005-11-25 Min. :0.00653 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2009-08-06 1st Qu.:0.14709 1st Qu.:0.1315 1st Qu.: 0.1225
## Median :2011-05-03 Median :0.21223 Median :0.1900 Median : 0.1800
## Mean :2011-03-26 Mean :0.22067 Mean :0.1987 Mean : 0.1886
## 3rd Qu.:2013-02-14 3rd Qu.:0.29510 3rd Qu.:0.2669 3rd Qu.: 0.2545
## Max. :2014-03-10 Max. :0.51229 Max. :0.4975 Max. : 0.4925
## NA's :25
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.183 Min. :0.005 Min. :-0.183
## 1st Qu.: 0.108 1st Qu.:0.050 1st Qu.: 0.077
## Median : 0.170 Median :0.096 Median : 0.112
## Mean : 0.175 Mean :0.093 Mean : 0.107
## 3rd Qu.: 0.245 3rd Qu.:0.140 3rd Qu.: 0.136
## Max. : 0.320 Max. :0.366 Max. : 0.284
## NA's :26745 NA's :26745 NA's :26745
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 D : 5604 Min. : 1.000
## 1st Qu.:2.000 C : 3707 1st Qu.: 5.000
## Median :3.000 E : 3620 Median : 6.000
## Mean :3.694 A : 3552 Mean : 6.298
## 3rd Qu.:5.000 HR : 3519 3rd Qu.: 8.000
## Max. :7.000 (Other): 5081 Max. :11.000
## NA's :26745 NA's :26745 NA's :26745
## ListingCategory.num BorrowerState Occupation
## 1 :16969 CA : 6866 Other :13232
## 0 :15610 FL : 2944 Professional : 6143
## 7 : 5758 IL : 2835 Computer Programmer : 2399
## 3 : 4032 TX : 2638 Administrative Assistant: 1807
## 2 : 3108 GA : 2572 Analyst : 1715
## 4 : 2197 (Other):28805 (Other) :24377
## (Other): 4154 NA's : 5168 NA's : 2155
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Full-time :23119 Min. : 0.00 Mode :logical
## Employed :15945 1st Qu.: 21.00 FALSE:27491
## Not available: 4979 Median : 52.00 TRUE :24337
## Self-employed: 2731 Mean : 80.76
## Part-time : 979 3rd Qu.:112.00
## (Other) : 1933 Max. :755.00
## NA's : 2142 NA's :7133
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 783C3371218786870A73D20: 974 Min. :2005-11-09
## FALSE:40949 3D4D3366260257624AB272D: 763 1st Qu.:2007-08-15
## TRUE :10879 6A3B336601725506917317E: 641 Median :2008-09-09
## FEF83377364176536637E50: 504 Mean :2009-08-01
## C9643379247860156A00EC0: 319 3rd Qu.:2011-09-19
## (Other) : 8480 Max. :2014-02-13
## NA's :40147
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1947-08-24
## 1st Qu.:640.0 1st Qu.:659.0 1st Qu.:1990-10-31
## Median :680.0 Median :699.0 Median :1995-11-01
## Mean :673.6 Mean :692.6 Mean :1994-12-27
## 3rd Qu.:720.0 3rd Qu.:739.0 3rd Qu.:2000-01-25
## Max. :880.0 Max. :899.0 Max. :2012-06-19
## NA's :567 NA's :567 NA's :669
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.000 Min. : 2.00
## 1st Qu.: 6.000 1st Qu.: 5.000 1st Qu.: 15.00
## Median : 9.000 Median : 8.000 Median : 23.00
## Mean : 9.587 Mean : 8.366 Mean : 25.19
## 3rd Qu.:13.000 3rd Qu.:11.000 3rd Qu.: 33.00
## Max. :59.000 Max. :51.000 Max. :136.00
## NA's :7123 NA's :7123 NA's :669
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 3.000 1st Qu.: 61.0 1st Qu.: 0.000
## Median : 5.000 Median : 185.0 Median : 1.000
## Mean : 6.114 Mean : 325.6 Mean : 1.994
## 3rd Qu.: 8.000 3rd Qu.: 419.0 3rd Qu.: 3.000
## Max. :51.000 Max. :14985.0 Max. :105.000
## NA's :669
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0 Min. : 0.0000 Min. : 0
## 1st Qu.: 2 1st Qu.: 0.0000 1st Qu.: 0
## Median : 5 Median : 0.0000 Median : 0
## Mean : 7 Mean : 0.8757 Mean : 1032
## 3rd Qu.: 9 3rd Qu.: 1.0000 3rd Qu.: 0
## Max. :377 Max. :83.0000 Max. :444745
## NA's :1107 NA's :669 NA's :7141
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 4.412 Mean : 0.3237
## 3rd Qu.: 4.000 3rd Qu.: 0.0000
## Max. :99.000 Max. :30.0000
## NA's :943 NA's :669
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.000 Min. : 0 Min. :0.000
## 1st Qu.:0.000 1st Qu.: 1677 1st Qu.:0.210
## Median :0.000 Median : 6170 Median :0.560
## Mean :0.023 Mean : 15742 Mean :0.527
## 3rd Qu.:0.000 3rd Qu.: 16306 3rd Qu.:0.840
## Max. :7.000 Max. :1435667 Max. :5.950
## NA's :7123 NA's :7123 NA's :7123
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 0.00 Min. :0.000
## 1st Qu.: 546 1st Qu.: 13.00 1st Qu.:0.780
## Median : 3439 Median : 20.00 Median :0.920
## Mean : 11139 Mean : 21.75 Mean :0.861
## 3rd Qu.: 12452 3rd Qu.: 29.00 3rd Qu.:1.000
## Max. :646285 Max. :126.00 Max. :1.000
## NA's :7069 NA's :7069 NA's :7069
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.000 $25,000-49,999:15308
## 1st Qu.: 0.000 1st Qu.: 0.130 $50,000-74,999:12059
## Median : 1.000 Median : 0.200 Not displayed : 7247
## Mean : 0.897 Mean : 0.288 $75,000-99,999: 6098
## 3rd Qu.: 1.000 3rd Qu.: 0.300 $100,000+ : 5805
## Max. :20.000 Max. :10.010 $1-24,999 : 4247
## NA's :7069 NA's :4006 (Other) : 1064
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 08C43696561586194AC381C: 2
## FALSE:4073 1st Qu.: 2833 09303699897852595CD59DD: 2
## TRUE :47755 Median : 4167 114D37056655628721BD6C8: 2
## Mean : 5082 156836977849742636AE34F: 2
## 3rd Qu.: 6250 56D73700259224545E36FBC: 2
## Max. :618548 63113695530739927C7EA06: 2
## (Other) :51816
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.00 1st Qu.: 9.00 1st Qu.: 9.00
## Median :1.00 Median : 15.00 Median : 15.00
## Mean :1.34 Mean : 20.18 Mean : 19.62
## 3rd Qu.:2.00 3rd Qu.: 29.00 3rd Qu.: 28.00
## Max. :7.00 Max. :120.00 Max. :114.00
## NA's :41751 NA's :41751 NA's :41751
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00
## Median : 0.00 Median : 0.00
## Mean : 0.52 Mean : 0.05
## 3rd Qu.: 0.00 3rd Qu.: 0.00
## Max. :42.00 Max. :21.00
## NA's :41751 NA's :41751
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 0 Min. : 0
## 1st Qu.: 3000 1st Qu.: 0
## Median : 5000 Median : 1043
## Mean : 7169 Mean : 2314
## 3rd Qu.: 9500 3rd Qu.: 3362
## Max. :60001 Max. :22587
## NA's :41751 NA's :41751
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-194.00 Min. : 0.0
## 1st Qu.: -32.00 1st Qu.: 0.0
## Median : 0.00 Median : 0.0
## Mean : 1.16 Mean : 268.7
## 3rd Qu.: 32.00 3rd Qu.: 154.0
## Max. : 286.00 Max. :2704.0
## NA's :41847
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 1.00 Min. : 1.00 Min. : 1
## 1st Qu.:10.00 1st Qu.: 30.00 1st Qu.: 18724
## Median :14.00 Median : 66.00 Median : 37028
## Mean :16.38 Mean : 55.05 Mean : 38445
## 3rd Qu.:22.00 3rd Qu.: 78.00 3rd Qu.: 53926
## Max. :44.00 Max. :100.00 Max. :132453
## NA's :38145
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2005-11-15 Q2 2008: 3969
## 1st Qu.: 2600 1st Qu.:2007-09-04 Q3 2008: 3299
## Median : 4500 Median :2008-09-25 Q1 2007: 2846
## Mean : 6242 Mean :2009-08-15 Q1 2008: 2819
## 3rd Qu.: 8000 3rd Qu.:2011-09-29 Q2 2007: 2808
## Max. :35000 Max. :2014-02-21 Q3 2007: 2431
## (Other):33656
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 16083364744933457E57FB9: 8 Min. : 0.00 Min. : -2.35
## 63CA34120866140639431C9: 8 1st Qu.: 98.03 1st Qu.: 2111.43
## 739C338135235294782AE75: 8 Median : 172.57 Median : 4354.66
## 7E1733653050264822FAA3D: 8 Mean : 222.99 Mean : 6082.62
## C70934206057523078260C7: 8 3rd Qu.: 299.71 3rd Qu.: 8166.37
## 458E33818543661332BC1BE: 7 Max. :2251.51 Max. :40702.39
## (Other) :51781
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0 Min. : -2.35 Min. :-589.95
## 1st Qu.: 1500 1st Qu.: 324.20 1st Qu.: -72.99
## Median : 3500 Median : 753.92 Median : -34.16
## Mean : 4968 Mean : 1114.18 Mean : -53.51
## 3rd Qu.: 7000 3rd Qu.: 1496.82 3rd Qu.: -14.57
## Max. :35000 Max. :15617.03 Max. : 2.87
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-1996.860 Min. : 0.0 Min. : -474.3
## 1st Qu.: 0.000 1st Qu.: 0.0 1st Qu.: 0.0
## Median : 0.000 Median : 0.0 Median : 0.0
## Mean : -7.666 Mean : 1260.8 Mean : 1261.5
## 3rd Qu.: 0.000 3rd Qu.: 662.6 3rd Qu.: 664.3
## Max. : 0.000 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. :0 Min. :0.7000 Min. : 0.00000
## 1st Qu.:0 1st Qu.:1.0000 1st Qu.: 0.00000
## Median :0 Median :1.0000 Median : 0.00000
## Mean :0 Mean :0.9986 Mean : 0.08879
## 3rd Qu.:0 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :0 Max. :1.0110 Max. :39.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. : 0.00000 Min. : 0.00 Min. : 1.0
## 1st Qu.: 0.00000 1st Qu.: 0.00 1st Qu.: 31.0
## Median : 0.00000 Median : 0.00 Median : 69.0
## Mean : 0.04631 Mean : 33.44 Mean : 104.8
## 3rd Qu.: 0.00000 3rd Qu.: 0.00 3rd Qu.: 142.0
## Max. :33.00000 Max. :25000.00 Max. :1189.0
##
## Rating
## D :10275
## C : 8864
## B : 7341
## HR : 6691
## A : 6675
## (Other):11851
## NA's : 131
# closed and completed loans with no non-principal recovery payments
summary(filter(data, !is.na(ClosedDate) &
                 LP_NonPrincipalRecoverypayments == 0 &
                 LoanStatus == "Completed"))
## ListingKey ListingNumber ListingCreationDate
## 018A360063948152589C8BE: 2 Min. : 4 Min. :2005-11-09
## 30F435938764424435A1188: 2 1st Qu.: 221154 1st Qu.:2007-10-25
## 32943590099161153292459: 2 Median : 425476 Median :2009-09-22
## 6DFC3591891372387BB41B2: 2 Mean : 388420 Mean :2009-09-15
## 778D35919242972923313E0: 2 3rd Qu.: 529268 3rd Qu.:2011-09-24
## 82FD35914405776692938D4: 2 Max. :1204824 Max. :2014-02-13
## (Other) :38062
## CreditGrade Term LoanStatus
## C : 3609 Min. :12.00 Completed :38074
## D : 3126 1st Qu.:36.00 Cancelled : 0
## B : 2987 Median :36.00 Chargedoff : 0
## AA : 2969 Mean :36.61 Current : 0
## A : 2505 3rd Qu.:36.00 Defaulted : 0
## (Other): 3092 Max. :60.00 FinalPaymentInProgress: 0
## NA's :19786 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2005-11-25 Min. :0.00653 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2010-01-28 1st Qu.:0.13271 1st Qu.:0.1173 1st Qu.: 0.1080
## Median :2011-07-14 Median :0.19479 Median :0.1744 Median : 0.1644
## Mean :2011-06-14 Mean :0.20878 Mean :0.1864 Mean : 0.1766
## 3rd Qu.:2013-03-06 3rd Qu.:0.28498 3rd Qu.:0.2511 3rd Qu.: 0.2411
## Max. :2014-03-04 Max. :0.51229 Max. :0.4975 Max. : 0.4925
## NA's :25
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.183 Min. :0.005 Min. :-0.183
## 1st Qu.: 0.093 1st Qu.:0.041 1st Qu.: 0.072
## Median : 0.154 Median :0.085 Median : 0.107
## Mean : 0.163 Mean :0.087 Mean : 0.102
## 3rd Qu.: 0.234 3rd Qu.:0.119 3rd Qu.: 0.132
## Max. : 0.320 Max. :0.366 Max. : 0.267
## NA's :18410 NA's :18410 NA's :18410
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 D : 4192 Min. : 1.000
## 1st Qu.:3.000 A : 3203 1st Qu.: 5.000
## Median :4.000 C : 2977 Median : 7.000
## Mean :3.908 B : 2785 Mean : 6.537
## 3rd Qu.:5.000 E : 2506 3rd Qu.: 8.000
## Max. :7.000 (Other): 4001 Max. :11.000
## NA's :18410 NA's :18410 NA's :18410
## ListingCategory.num BorrowerState Occupation
## 1 :13167 CA : 4957 Other : 9315
## 0 :10253 FL : 2094 Professional : 4804
## 7 : 4551 IL : 2017 Computer Programmer: 2073
## 3 : 2848 NY : 1935 Analyst : 1451
## 2 : 2434 TX : 1831 Executive : 1294
## 4 : 1620 (Other):21357 (Other) :17685
## (Other): 3201 NA's : 3883 NA's : 1452
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Full-time :17397 Min. : 0.00 Mode :logical
## Employed :12332 1st Qu.: 21.00 FALSE:19794
## Not available: 3077 Median : 53.00 TRUE :18280
## Self-employed: 1783 Mean : 81.26
## Part-time : 794 3rd Qu.:112.00
## (Other) : 1252 Max. :745.00
## NA's : 1439 NA's :4526
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 783C3371218786870A73D20: 562 Min. :2005-11-09
## FALSE:30898 3D4D3366260257624AB272D: 512 1st Qu.:2007-10-22
## TRUE :7176 6A3B336601725506917317E: 396 Median :2009-10-01
## FEF83377364176536637E50: 268 Mean :2009-09-17
## CC8D33653247904019A9059: 258 3rd Qu.:2011-09-26
## (Other) : 5762 Max. :2014-02-13
## NA's :30316
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1950-08-01
## 1st Qu.:640.0 1st Qu.:659.0 1st Qu.:1990-10-31
## Median :680.0 Median :699.0 Median :1995-11-01
## Mean :685.6 Mean :704.6 Mean :1994-12-30
## 3rd Qu.:740.0 3rd Qu.:759.0 3rd Qu.:2000-01-21
## Max. :880.0 Max. :899.0 Max. :2012-06-19
## NA's :416 NA's :416 NA's :463
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.00 Min. : 2.00
## 1st Qu.: 6.000 1st Qu.: 5.00 1st Qu.: 15.00
## Median : 9.000 Median : 8.00 Median : 23.00
## Mean : 9.692 Mean : 8.45 Mean : 25.39
## 3rd Qu.:13.000 3rd Qu.:11.00 3rd Qu.: 33.00
## Max. :59.000 Max. :48.00 Max. :136.00
## NA's :4517 NA's :4517 NA's :463
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0 Min. : 0.000
## 1st Qu.: 3.000 1st Qu.: 71 1st Qu.: 0.000
## Median : 5.000 Median : 196 Median : 1.000
## Mean : 6.294 Mean : 328 Mean : 1.633
## 3rd Qu.: 8.000 3rd Qu.: 425 3rd Qu.: 2.000
## Max. :49.000 Max. :12769 Max. :63.000
## NA's :463
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.0000 Min. : 0.0
## 1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.: 0.0
## Median : 4.000 Median : 0.0000 Median : 0.0
## Mean : 6.109 Mean : 0.5958 Mean : 895.4
## 3rd Qu.: 8.000 3rd Qu.: 0.0000 3rd Qu.: 0.0
## Max. :113.000 Max. :50.0000 Max. :327677.0
## NA's :786 NA's :463 NA's :4533
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 3.925 Mean : 0.2801
## 3rd Qu.: 3.000 3rd Qu.: 0.0000
## Max. :99.000 Max. :21.0000
## NA's :644 NA's :463
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.000 Min. : 0 Min. :0.000
## 1st Qu.:0.000 1st Qu.: 1867 1st Qu.:0.210
## Median :0.000 Median : 6455 Median :0.540
## Mean :0.019 Mean : 15743 Mean :0.516
## 3rd Qu.:0.000 3rd Qu.: 16368 3rd Qu.:0.830
## Max. :4.000 Max. :1435667 Max. :5.950
## NA's :4517 NA's :4517 NA's :4517
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 0.00 Min. :0.000
## 1st Qu.: 740 1st Qu.: 13.00 1st Qu.:0.800
## Median : 4111 Median : 20.00 Median :0.930
## Mean : 12385 Mean : 22.04 Mean :0.872
## 3rd Qu.: 14172 3rd Qu.: 29.00 3rd Qu.:1.000
## Max. :646285 Max. :126.00 Max. :1.000
## NA's :4474 NA's :4474 NA's :4474
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.0000 $25,000-49,999:10891
## 1st Qu.: 0.000 1st Qu.: 0.1200 $50,000-74,999: 9282
## Median : 0.000 Median : 0.1900 $75,000-99,999: 4914
## Mean : 0.822 Mean : 0.2642 $100,000+ : 4774
## 3rd Qu.: 1.000 3rd Qu.: 0.2900 Not displayed : 4610
## Max. :20.000 Max. :10.0100 $1-24,999 : 2908
## NA's :4474 NA's :2734 (Other) : 695
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 08C43696561586194AC381C: 2
## FALSE:2782 1st Qu.: 2917 09303699897852595CD59DD: 2
## TRUE :35292 Median : 4417 114D37056655628721BD6C8: 2
## Mean : 5324 156836977849742636AE34F: 2
## 3rd Qu.: 6583 56D73700259224545E36FBC: 2
## Max. :618548 63113695530739927C7EA06: 2
## (Other) :38062
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :0.000 Min. : 0.0 Min. : 0.00
## 1st Qu.:1.000 1st Qu.: 9.0 1st Qu.: 9.00
## Median :1.000 Median : 16.0 Median : 15.00
## Mean :1.357 Mean : 20.5 Mean : 19.98
## 3rd Qu.:2.000 3rd Qu.: 29.0 3rd Qu.: 28.00
## Max. :7.000 Max. :120.0 Max. :114.00
## NA's :29989 NA's :29989 NA's :29989
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.000 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.000
## Median : 0.000 Median : 0.000
## Mean : 0.479 Mean : 0.041
## 3rd Qu.: 0.000 3rd Qu.: 0.000
## Max. :42.000 Max. :21.000
## NA's :29989 NA's :29989
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 0 Min. : 0.0
## 1st Qu.: 3000 1st Qu.: 0.0
## Median : 5000 Median : 817.7
## Mean : 7238 Mean : 2120.9
## 3rd Qu.: 9750 3rd Qu.: 3104.2
## Max. :60001 Max. :22538.1
## NA's :29989 NA's :29989
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-175.000 Min. :0
## 1st Qu.: -27.000 1st Qu.:0
## Median : 0.000 Median :0
## Mean : 4.522 Mean :0
## 3rd Qu.: 35.000 3rd Qu.:0
## Max. : 286.000 Max. :0
## NA's :30083
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 5.00 Min. : 1.00 Min. : 1
## 1st Qu.:12.50 1st Qu.: 29.00 1st Qu.: 21437
## Median :18.00 Median : 53.00 Median : 39068
## Mean :20.91 Mean : 53.51 Mean : 39965
## 3rd Qu.:29.50 3rd Qu.: 76.00 3rd Qu.: 54246
## Max. :42.00 Max. :100.00 Max. :132453
## NA's :38028
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2005-11-15 Q2 2008: 2867
## 1st Qu.: 2550 1st Qu.:2007-11-06 Q3 2008: 2455
## Median : 4500 Median :2009-10-23 Q1 2008: 2038
## Mean : 6189 Mean :2009-10-01 Q2 2007: 1853
## 3rd Qu.: 8000 3rd Qu.:2011-10-07 Q1 2007: 1788
## Max. :35000 Max. :2014-02-21 Q4 2011: 1669
## (Other):25404
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 63CA34120866140639431C9: 8 Min. : 0.00 Min. : 0
## 739C338135235294782AE75: 8 1st Qu.: 94.15 1st Qu.: 3188
## 7E1733653050264822FAA3D: 8 Median : 171.10 Median : 5445
## C70934206057523078260C7: 8 Mean : 218.78 Mean : 7323
## 16083364744933457E57FB9: 7 3rd Qu.: 297.00 3rd Qu.: 9844
## A833340429888765780A3F0: 7 Max. :2251.51 Max. :40702
## (Other) :38028
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0 Min. : -0.003 Min. :-589.95
## 1st Qu.: 2550 1st Qu.: 336.135 1st Qu.: -79.83
## Median : 4500 Median : 781.290 Median : -38.35
## Mean : 6183 Mean : 1139.779 Mean : -57.65
## 3rd Qu.: 8000 3rd Qu.: 1537.440 3rd Qu.: -16.62
## Max. :35000 Max. :15617.030 Max. : 2.87
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-1996.860 Min. :0 Min. :0
## 1st Qu.: 0.000 1st Qu.:0 1st Qu.:0
## Median : 0.000 Median :0 Median :0
## Mean : -5.104 Mean :0 Mean :0
## 3rd Qu.: 0.000 3rd Qu.:0 3rd Qu.:0
## Max. : 0.000 Max. :0 Max. :0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. :0 Min. :0.7000 Min. : 0.00000
## 1st Qu.:0 1st Qu.:1.0000 1st Qu.: 0.00000
## Median :0 Median :1.0000 Median : 0.00000
## Mean :0 Mean :0.9987 Mean : 0.09245
## 3rd Qu.:0 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :0 Max. :1.0110 Max. :39.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. : 0.00000 Min. : 0.00 Min. : 1.0
## 1st Qu.: 0.00000 1st Qu.: 0.00 1st Qu.: 34.0
## Median : 0.00000 Median : 0.00 Median : 74.0
## Mean : 0.05158 Mean : 36.28 Mean : 108.2
## 3rd Qu.: 0.00000 3rd Qu.: 0.00 3rd Qu.: 147.0
## Max. :33.00000 Max. :25000.00 Max. :1189.0
##
## Rating
## D :7318
## C :6586
## B :5772
## A :5708
## AA :4669
## (Other):7899
## NA's : 122
What I see here is that LP_CollectionFees is nonzero primarily for loans with poor credit grades or Prosper ratings, and disproportionately for loans that have been charged off or defaulted. All of the borrowers have apparently paid collection fees.
All loans where LP_NetPrincipalLoss is nonzero were either charged off or defaulted. Likewise, all loans where LP_NonPrincipalRecoverypayments is nonzero were charged off or defaulted; as the summary above shows, completed loans have zero for both.
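A claim of this kind can be checked directly by cross-tabulating LoanStatus against whether the loss column is zero. The following is a minimal sketch on a small made-up data frame (the `toy` rows are hypothetical and not taken from the Prosper data); on the real data the same `table()` call could be applied to the corresponding columns of `data`.

```r
# Hypothetical toy rows, for illustration only (not from prosperLoanData.csv)
toy <- data.frame(
  LoanStatus = c("Completed", "Completed", "Chargedoff", "Defaulted", "Current"),
  LP_NetPrincipalLoss = c(0, 0, 3790.25, 6012.67, 0)
)

# Cross-tabulate loan status against whether any net principal loss was recorded
loss_by_status <- table(toy$LoanStatus, toy$LP_NetPrincipalLoss != 0,
                        dnn = c("LoanStatus", "NonzeroLoss"))
print(loss_by_status)

# Statuses that occur among rows with a nonzero loss
statuses_with_loss <- unique(toy$LoanStatus[toy$LP_NetPrincipalLoss != 0])
print(statuses_with_loss)
```

If a status other than the expected ones shows a count in the `TRUE` column, the claim does not hold for that column.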
All this leads me to provisionally conclude that the LP_ variables reflect actual, rather than predicted, measures of whether and to what extent a loan has been repaid, and further that the data seems to be reasonably complete and consistent (I examine it more closely in the following sections).
I plan to look primarily at how well predictors of lender profit correlate with actual profit. For this purpose, I will exclude loans that are still open, since their final profit is unknown. (It would also be possible, for example, to look at how well these predictors correlate with the likelihood of a loan being current or past due.)
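As a rough illustration of what correlating a predictor with actual profit could look like, here is a minimal sketch using base R's `cor()`. The numbers in `toy` are made up for illustration; the column names simply mirror EstimatedReturn (a predictor that has many NA values in this data) and an actual yield measure. The choice of `use = "complete.obs"` is an assumption about how missing predictions should be handled, not something fixed by the analysis.

```r
# Hypothetical toy values, for illustration only (not from the Prosper data)
toy <- data.frame(
  EstimatedReturn = c(0.07, 0.11, NA, 0.13, 0.10, NA),
  ActualYield     = c(0.12, -0.87, 0.20, 0.15, -0.45, 0.05)
)

# Pearson correlation, dropping rows where either value is missing
r <- cor(toy$EstimatedReturn, toy$ActualYield, use = "complete.obs")
print(r)
```

Passing `method = "spearman"` to `cor()` gives a rank-based alternative, which may be less sensitive to extreme yields.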
The most relevant outcomes for lenders seem to be the following: LoanStatus (current or end status of the loan), LP_CustomerPayments (cumulative payments made by customers, prior to any charge-offs), LoanOriginalAmount (for determining percentage lost/repaid), and LP_NetPrincipalLoss (amount still uncollected after recoveries). LP_ServiceFees and LP_CollectionFees are also relevant to determining final yield/loss, and can be combined with LP_NetPrincipalLoss to approximate total loss (minding their sign: in this data the fees are recorded mostly as negative amounts).
Here I will create a new variable to consolidate the yield and loss measures that I am interested in, so as to avoid unnecessary and (for my current purposes) relatively uninteresting complexity in plotting outcomes. I will also take a closer look at data where yield or loss seem relatively extreme.
# Create a subset of closed loans only (those for which lender profit can be calculated).
# Add PercentYield, an overall lender yield computed from the collected yield and loss measures.
# Add Completed, which states whether or not a loan was completed, and Completed.num, its 0/1 numeric version.
lender_data <- data %>%
  filter(LoanStatus %in% c("Cancelled", "Chargedoff", "Completed", "Defaulted")) %>%
  mutate(PercentYield =
           (LoanOriginalAmount - LP_NetPrincipalLoss + LP_ServiceFees +
              LP_CollectionFees + LP_NonPrincipalRecoverypayments +
              LP_InterestandFees) / LoanOriginalAmount - 1) %>%
  mutate(Completed = factor(ifelse(LoanStatus == "Completed", "Completed", "Not Completed"))) %>%
  mutate(Completed.num = as.numeric(Completed == "Completed"))
# check that the calculation was done correctly:
lender_data[c(1, 4, 5), ] %>%
  select(LoanOriginalAmount, starts_with("LP_"), PercentYield, Completed) %>%
  rowid_to_column() %>%
  gather(var, value, -rowid) %>%
  spread(rowid, value) %>%
  print(n = Inf)
## # A tibble: 11 x 4
## var `1` `2` `3`
## <chr> <chr> <chr> <chr>
## 1 Completed Completed Not Completed Not Co…
## 2 LoanOriginalAmount 9425 4000 10000
## 3 LP_CollectionFees 0 0 0
## 4 LP_CustomerPayments 11396.14 521.13 5325.33
## 5 LP_CustomerPrincipalPayments 9425 209.75 3987.33
## 6 LP_GrossPrincipalLoss 0 3790.25 6012.65
## 7 LP_InterestandFees 1971.14 311.38 1338
## 8 LP_NetPrincipalLoss 0 3790.25 6012.67
## 9 LP_NonPrincipalRecoverypayments 0 0 268.96
## 10 LP_ServiceFees -133.18 -9.81 -54.61
## 11 PercentYield 0.195009018567639 -0.87217 -0.446…
summary(lender_data)
## ListingKey ListingNumber ListingCreationDate
## 018A360063948152589C8BE: 2 Min. : 4 Min. :2005-11-09
## 30F435938764424435A1188: 2 1st Qu.: 186264 1st Qu.:2007-08-13
## 32943590099161153292459: 2 Median : 386511 Median :2008-08-21
## 6DFC3591891372387BB41B2: 2 Mean : 369056 Mean :2009-07-14
## 778D35919242972923313E0: 2 3rd Qu.: 524183 3rd Qu.:2011-08-27
## 82FD35914405776692938D4: 2 Max. :1204824 Max. :2014-02-13
## (Other) :55077
## CreditGrade Term LoanStatus
## C : 5649 Min. :12.00 Completed :38074
## D : 5153 1st Qu.:36.00 Chargedoff :11992
## B : 4389 Median :36.00 Defaulted : 5018
## AA : 3509 Mean :36.94 Cancelled : 5
## HR : 3508 3rd Qu.:36.00 Current : 0
## (Other): 6745 Max. :60.00 FinalPaymentInProgress: 0
## NA's :26136 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2005-11-25 Min. :0.00653 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2009-07-14 1st Qu.:0.14974 1st Qu.:0.1350 1st Qu.: 0.1250
## Median :2011-04-05 Median :0.21434 Median :0.1945 Median : 0.1826
## Mean :2011-03-07 Mean :0.22219 Mean :0.2004 Mean : 0.1903
## 3rd Qu.:2013-01-30 3rd Qu.:0.29510 3rd Qu.:0.2699 3rd Qu.: 0.2572
## Max. :2014-03-10 Max. :0.51229 Max. :0.4975 Max. : 0.4925
## NA's :25
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.183 Min. :0.005 Min. :-0.183
## 1st Qu.: 0.111 1st Qu.:0.052 1st Qu.: 0.078
## Median : 0.172 Median :0.098 Median : 0.114
## Mean : 0.176 Mean :0.094 Mean : 0.108
## 3rd Qu.: 0.247 3rd Qu.:0.140 3rd Qu.: 0.136
## Max. : 0.320 Max. :0.366 Max. : 0.284
## NA's :29084 NA's :29084 NA's :29084
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 D : 5869 Min. : 1.000
## 1st Qu.:2.000 E : 3830 1st Qu.: 5.000
## Median :3.000 C : 3817 Median : 6.000
## Mean :3.663 HR : 3725 Mean : 6.266
## 3rd Qu.:5.000 A : 3608 3rd Qu.: 8.000
## Max. :7.000 (Other): 5156 Max. :11.000
## NA's :29084 NA's :29084 NA's :29084
## ListingCategory.num BorrowerState Occupation
## 1 :17868 CA : 7263 Other :14056
## 0 :16952 FL : 3078 Professional : 6515
## 7 : 6042 IL : 3039 Computer Programmer : 2494
## 3 : 4257 GA : 2783 Administrative Assistant: 1934
## 2 : 3244 TX : 2752 Sales - Commission : 1809
## 4 : 2395 (Other):30659 (Other) :26013
## (Other): 4331 NA's : 5515 NA's : 2268
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Full-time :24958 Min. : 0.00 Mode :logical
## Employed :16491 1st Qu.: 21.00 FALSE:29202
## Not available: 5347 Median : 52.00 TRUE :25887
## Self-employed: 2926 Mean : 80.89
## Part-time : 1056 3rd Qu.:112.00
## (Other) : 2056 Max. :755.00
## NA's : 2255 NA's :7615
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 783C3371218786870A73D20: 1061 Min. :2005-11-09
## FALSE:43245 3D4D3366260257624AB272D: 806 1st Qu.:2007-08-06
## TRUE :11844 6A3B336601725506917317E: 672 Median :2008-08-20
## FEF83377364176536637E50: 582 Mean :2009-07-14
## C9643379247860156A00EC0: 342 3rd Qu.:2011-08-29
## (Other) : 9206 Max. :2014-02-13
## NA's :42420
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1947-08-24
## 1st Qu.:640.0 1st Qu.:659.0 1st Qu.:1990-09-28
## Median :680.0 Median :699.0 Median :1995-10-10
## Mean :671.7 Mean :690.7 Mean :1994-12-01
## 3rd Qu.:720.0 3rd Qu.:739.0 3rd Qu.:2000-01-01
## Max. :880.0 Max. :899.0 Max. :2012-06-19
## NA's :591 NA's :591 NA's :697
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.000 Min. : 2.00
## 1st Qu.: 6.000 1st Qu.: 5.000 1st Qu.: 15.00
## Median : 9.000 Median : 8.000 Median : 23.00
## Mean : 9.569 Mean : 8.338 Mean : 25.28
## 3rd Qu.:13.000 3rd Qu.:11.000 3rd Qu.: 33.00
## Max. :59.000 Max. :51.000 Max. :136.00
## NA's :7604 NA's :7604 NA's :697
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 3.000 1st Qu.: 60.0 1st Qu.: 0.000
## Median : 5.000 Median : 183.0 Median : 1.000
## Mean : 6.078 Mean : 325.3 Mean : 2.052
## 3rd Qu.: 8.000 3rd Qu.: 418.0 3rd Qu.: 3.000
## Max. :51.000 Max. :14985.0 Max. :105.000
## NA's :697
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.0000 Min. : 0
## 1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.: 0
## Median : 5.000 Median : 0.0000 Median : 0
## Mean : 7.167 Mean : 0.9064 Mean : 1051
## 3rd Qu.: 9.000 3rd Qu.: 1.0000 3rd Qu.: 0
## Max. :379.000 Max. :83.0000 Max. :444745
## NA's :1159 NA's :697 NA's :7622
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 4.572 Mean : 0.3311
## 3rd Qu.: 4.000 3rd Qu.: 0.0000
## Max. :99.000 Max. :30.0000
## NA's :990 NA's :697
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.000 Min. : 0 Min. :0.00
## 1st Qu.:0.000 1st Qu.: 1614 1st Qu.:0.21
## Median :0.000 Median : 6073 Median :0.56
## Mean :0.024 Mean : 15689 Mean :0.53
## 3rd Qu.:0.000 3rd Qu.: 16169 3rd Qu.:0.85
## Max. :7.000 Max. :1435667 Max. :5.95
## NA's :7604 NA's :7604 NA's :7604
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 0.00 Min. :0.000
## 1st Qu.: 506 1st Qu.: 13.00 1st Qu.:0.770
## Median : 3246 Median : 20.00 Median :0.920
## Mean : 10844 Mean : 21.79 Mean :0.856
## 3rd Qu.: 12046 3rd Qu.: 29.00 3rd Qu.:1.000
## Max. :646285 Max. :126.00 Max. :1.000
## NA's :7544 NA's :7544 NA's :7544
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.00 $25,000-49,999:16344
## 1st Qu.: 0.000 1st Qu.: 0.13 $50,000-74,999:12789
## Median : 1.000 Median : 0.20 Not displayed : 7741
## Mean : 0.909 Mean : 0.29 $75,000-99,999: 6442
## 3rd Qu.: 1.000 3rd Qu.: 0.30 $100,000+ : 6064
## Max. :20.000 Max. :10.01 $1-24,999 : 4571
## NA's :7544 NA's :4230 (Other) : 1138
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 08C43696561586194AC381C: 2
## FALSE:4301 1st Qu.: 2809 09303699897852595CD59DD: 2
## TRUE :50788 Median : 4167 114D37056655628721BD6C8: 2
## Mean : 5054 156836977849742636AE34F: 2
## 3rd Qu.: 6250 56D73700259224545E36FBC: 2
## Max. :618548 63113695530739927C7EA06: 2
## (Other) :55077
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.00 1st Qu.: 9.00 1st Qu.: 9.00
## Median :1.00 Median : 15.00 Median : 14.00
## Mean :1.33 Mean : 20.08 Mean : 19.49
## 3rd Qu.:1.00 3rd Qu.: 28.00 3rd Qu.: 27.00
## Max. :7.00 Max. :120.00 Max. :114.00
## NA's :44550 NA's :44550 NA's :44550
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00
## Median : 0.00 Median : 0.00
## Mean : 0.54 Mean : 0.05
## 3rd Qu.: 0.00 3rd Qu.: 0.00
## Max. :42.00 Max. :21.00
## NA's :44550 NA's :44550
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 0 Min. : 0
## 1st Qu.: 3000 1st Qu.: 0
## Median : 5000 Median : 1098
## Mean : 7105 Mean : 2333
## 3rd Qu.: 9500 3rd Qu.: 3383
## Max. :60001 Max. :22587
## NA's :44550 NA's :44550
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-194.00 Min. : 0.0
## 1st Qu.: -32.00 1st Qu.: 0.0
## Median : 0.00 Median : 0.0
## Mean : 1.15 Mean : 314.5
## 3rd Qu.: 32.00 3rd Qu.: 228.0
## Max. : 286.00 Max. :2704.0
## NA's :44647
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.00 Min. : 1.00 Min. : 1
## 1st Qu.: 9.00 1st Qu.: 30.00 1st Qu.: 18295
## Median :14.00 Median : 66.00 Median : 36353
## Mean :16.27 Mean : 55.63 Mean : 37935
## 3rd Qu.:22.00 3rd Qu.: 79.00 3rd Qu.: 53169
## Max. :44.00 Max. :100.00 Max. :132453
## NA's :38145
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2005-11-15 Q2 2008: 4344
## 1st Qu.: 2600 1st Qu.:2007-08-24 Q3 2008: 3602
## Median : 4500 Median :2008-09-05 Q2 2007: 3118
## Mean : 6261 Mean :2009-07-28 Q1 2007: 3079
## 3rd Qu.: 8000 3rd Qu.:2011-09-13 Q1 2008: 3074
## Max. :35000 Max. :2014-02-21 Q3 2007: 2671
## (Other):35201
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 16083364744933457E57FB9: 8 Min. : 0.00 Min. : -2.35
## 63CA34120866140639431C9: 8 1st Qu.: 98.29 1st Qu.: 2029.68
## 739C338135235294782AE75: 8 Median : 172.60 Median : 4208.27
## 7E1733653050264822FAA3D: 8 Mean : 223.72 Mean : 5927.15
## C70934206057523078260C7: 8 3rd Qu.: 300.43 3rd Qu.: 7935.81
## 458E33818543661332BC1BE: 7 Max. :2251.51 Max. :40702.39
## (Other) :55042
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0 Min. : -2.35 Min. :-664.87
## 1st Qu.: 1355 1st Qu.: 331.07 1st Qu.: -73.02
## Median : 3150 Median : 763.98 Median : -34.25
## Mean : 4800 Mean : 1126.98 Mean : -53.66
## 3rd Qu.: 6500 3rd Qu.: 1509.52 3rd Qu.: -14.65
## Max. :35000 Max. :15617.03 Max. : 32.06
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9275 Min. : -94.2 Min. : -954.5
## 1st Qu.: 0 1st Qu.: 0.0 1st Qu.: 0.0
## Median : 0 Median : 0.0 Median : 0.0
## Mean : -25 Mean : 1448.7 Mean : 1409.3
## 3rd Qu.: 0 3rd Qu.: 1430.9 3rd Qu.: 1297.9
## Max. : 0 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0 Min. :0.7000 Min. : 0.00000
## 1st Qu.: 0 1st Qu.:1.0000 1st Qu.: 0.00000
## Median : 0 Median :1.0000 Median : 0.00000
## Mean : 52 Mean :0.9986 Mean : 0.08938
## 3rd Qu.: 0 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :21118 Max. :1.0110 Max. :39.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. : 0.00000 Min. : 0.00 Min. : 1.0
## 1st Qu.: 0.00000 1st Qu.: 0.00 1st Qu.: 31.0
## Median : 0.00000 Median : 0.00 Median : 69.0
## Mean : 0.04611 Mean : 33.59 Mean : 104.9
## 3rd Qu.: 0.00000 3rd Qu.: 0.00 3rd Qu.: 142.0
## Max. :33.00000 Max. :25000.00 Max. :1189.0
##
## Rating PercentYield Completed Completed.num
## D :11022 Min. :-1.00092 Completed :38074 Min. :0.0000
## C : 9466 1st Qu.:-0.16895 Not Completed:17015 1st Qu.:0.0000
## B : 7762 Median : 0.10328 Median :1.0000
## HR : 7233 Mean :-0.02445 Mean :0.6911
## E : 7119 3rd Qu.: 0.22522 3rd Qu.:1.0000
## (Other):12356 Max. : 1.72976 Max. :1.0000
## NA's : 131
Everything seems to check out with the PercentYield calculation.
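The check can also be reproduced by hand from the three rows printed earlier. The sketch below copies those printed values into a small data frame (the name `check` is introduced here for illustration) and applies the same formula used for PercentYield.

```r
# Values copied from the three rows printed in the calculation check
check <- data.frame(
  LoanOriginalAmount              = c(9425, 4000, 10000),
  LP_NetPrincipalLoss             = c(0, 3790.25, 6012.67),
  LP_ServiceFees                  = c(-133.18, -9.81, -54.61),
  LP_CollectionFees               = c(0, 0, 0),
  LP_NonPrincipalRecoverypayments = c(0, 0, 268.96),
  LP_InterestandFees              = c(1971.14, 311.38, 1338)
)

# Same formula as PercentYield in lender_data
check$PercentYield <- with(check,
  (LoanOriginalAmount - LP_NetPrincipalLoss + LP_ServiceFees + LP_CollectionFees +
     LP_NonPrincipalRecoverypayments + LP_InterestandFees) / LoanOriginalAmount - 1)

print(round(check$PercentYield, 5))
# 0.19501 -0.87217 -0.44603
```

For the first loan, for example, this is (9425 - 0 - 133.18 + 0 + 0 + 1971.14) / 9425 - 1, or about 0.195, which agrees with the printed PercentYield.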
# extreme gains
summary(filter(lender_data, PercentYield > 1))
## ListingKey ListingNumber ListingCreationDate
## 086C3431914095124482AF4:1 Min. : 26634 Min. :2006-07-25
## 26C03463657651689E79CAB:1 1st Qu.:257558 1st Qu.:2007-12-25
## 2A97342295011203308BCFB:1 Median :350085 Median :2008-06-13
## C3383423078575403A751E5:1 Mean :298617 Mean :2008-04-13
## D21E3365863982309A3ECAC:1 3rd Qu.:386123 3rd Qu.:2008-08-20
## F1A6342598834278592579C:1 Max. :426236 Max. :2009-09-29
## (Other) :1
## CreditGrade Term LoanStatus
## HR :4 Min. :36 Defaulted :4
## E :2 1st Qu.:36 Completed :2
## NC :0 Median :36 Chargedoff :1
## D :0 Mean :36 Cancelled :0
## C :0 3rd Qu.:36 Current :0
## (Other):0 Max. :36 FinalPaymentInProgress:0
## NA's :1 (Other) :0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2008-01-19 Min. :0.2691 Min. :0.2500 Min. :0.2450
## 1st Qu.:2008-12-24 1st Qu.:0.3218 1st Qu.:0.3000 1st Qu.:0.2900
## Median :2011-10-23 Median :0.3745 Median :0.3500 Median :0.3400
## Mean :2010-09-11 Mean :0.3468 Mean :0.3214 Mean :0.3121
## 3rd Qu.:2011-12-31 3rd Qu.:0.3745 3rd Qu.:0.3500 3rd Qu.:0.3400
## Max. :2013-01-18 Max. :0.3915 Max. :0.3500 Max. :0.3400
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn ProsperRating.num
## Min. :0.1299 Min. :0.19 Min. :0.1299 Min. :1
## 1st Qu.:0.1299 1st Qu.:0.19 1st Qu.:0.1299 1st Qu.:1
## Median :0.1299 Median :0.19 Median :0.1299 Median :1
## Mean :0.1299 Mean :0.19 Mean :0.1299 Mean :1
## 3rd Qu.:0.1299 3rd Qu.:0.19 3rd Qu.:0.1299 3rd Qu.:1
## Max. :0.1299 Max. :0.19 Max. :0.1299 Max. :1
## NA's :6 NA's :6 NA's :6 NA's :6
## ProsperRating.alpha ProsperScore ListingCategory.num BorrowerState
## HR :1 Min. :5 0 :2 PA :2
## NC :0 1st Qu.:5 1 :1 MD :1
## E :0 Median :5 2 :1 OR :1
## D :0 Mean :5 4 :1 RI :1
## C :0 3rd Qu.:5 6 :1 TX :1
## (Other):0 Max. :5 7 :1 VA :1
## NA's :6 NA's :6 (Other):0 (Other):0
## Occupation EmploymentStatus EmploymentStatusDuration
## Clerical :2 Full-time :6 Min. :10.00
## Computer Programmer:2 Not available:1 1st Qu.:11.25
## Professional :1 Employed :0 Median :18.00
## Skilled Labor :1 Not employed :0 Mean :30.50
## Teacher :1 Other :0 3rd Qu.:27.00
## Accountant/CPA :0 Part-time :0 Max. :98.00
## (Other) :0 (Other) :0 NA's :1
## IsBorrowerHomeowner CurrentlyInGroup GroupKey
## Mode :logical Mode :logical 00343376901312423168731:0
## FALSE:4 FALSE:7 00943382969547936B0C529:0
## TRUE :3 00AE3392027644405556335:0
## 016833805323396548B2370:0
## 01A133661136027706728BE:0
## (Other) :0
## NA's :7
## DateCreditPulled CreditScoreRangeLower CreditScoreRangeUpper
## Min. :2006-07-20 Min. :520.0 Min. :539.0
## 1st Qu.:2007-12-01 1st Qu.:520.0 1st Qu.:539.0
## Median :2008-06-13 Median :540.0 Median :559.0
## Mean :2008-03-28 Mean :548.6 Mean :567.6
## 3rd Qu.:2008-07-30 3rd Qu.:560.0 3rd Qu.:579.0
## Max. :2009-09-09 Max. :620.0 Max. :639.0
##
## FirstRecordedCreditLine CurrentCreditLines OpenCreditLines
## Min. :1990-11-20 Min. : 1.000 Min. : 2.000
## 1st Qu.:1995-08-30 1st Qu.: 3.250 1st Qu.: 3.250
## Median :1996-09-11 Median : 7.500 Median : 4.500
## Mean :1996-12-02 Mean : 6.667 Mean : 5.333
## 3rd Qu.:1997-09-28 3rd Qu.: 8.750 3rd Qu.: 5.750
## Max. :2004-01-23 Max. :13.000 Max. :12.000
## NA's :1 NA's :1
## TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : 7.00 Min. : 0.000
## 1st Qu.:19.00 1st Qu.: 1.500
## Median :26.00 Median : 2.000
## Mean :23.71 Mean : 3.286
## 3rd Qu.:31.00 3rd Qu.: 3.500
## Max. :33.00 Max. :11.000
##
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. : 0.0 Min. :0 Min. : 1.000
## 1st Qu.: 52.5 1st Qu.:1 1st Qu.: 4.500
## Median : 69.0 Median :2 Median : 8.000
## Mean :133.1 Mean :3 Mean : 9.571
## 3rd Qu.:151.5 3rd Qu.:4 3rd Qu.:12.000
## Max. :455.0 Max. :9 Max. :25.000
##
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. :0.000 Min. : 0.0 Min. : 0.00
## 1st Qu.:1.000 1st Qu.: 115.5 1st Qu.: 1.00
## Median :2.000 Median : 444.0 Median :10.00
## Mean :3.286 Mean : 468.8 Mean :10.86
## 3rd Qu.:5.000 3rd Qu.: 697.5 3rd Qu.:11.00
## Max. :9.000 Max. :1137.0 Max. :42.00
## NA's :1
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. :0.0000 Min. :0 Min. : 965
## 1st Qu.:0.0000 1st Qu.:0 1st Qu.: 1178
## Median :0.0000 Median :0 Median : 1852
## Mean :0.2857 Mean :0 Mean : 3756
## 3rd Qu.:0.5000 3rd Qu.:0 3rd Qu.: 3407
## Max. :1.0000 Max. :0 Max. :12989
## NA's :1 NA's :1
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. :0.7200 Min. : 0.0 Min. : 5.00
## 1st Qu.:0.9925 1st Qu.: 0.0 1st Qu.:17.75
## Median :1.0400 Median : 40.5 Median :20.50
## Mean :1.0733 Mean :117.5 Mean :19.67
## 3rd Qu.:1.1100 3rd Qu.:124.5 3rd Qu.:24.00
## Max. :1.5300 Max. :485.0 Max. :30.00
## NA's :1 NA's :1 NA's :1
## TradesNeverDelinquent.per TradesOpenedLast6Months DebtToIncomeRatio
## Min. :0.2700 Min. :0.00 Min. :0.130
## 1st Qu.:0.5950 1st Qu.:0.00 1st Qu.:0.135
## Median :0.6700 Median :0.00 Median :0.160
## Mean :0.6383 Mean :0.50 Mean :0.210
## 3rd Qu.:0.7750 3rd Qu.:0.75 3rd Qu.:0.285
## Max. :0.8400 Max. :2.00 Max. :0.340
## NA's :1 NA's :1
## IncomeRange IncomeVerifiable StatedMonthlyIncome
## $25,000-49,999:3 Mode:logical Min. :2000
## $50,000-74,999:2 TRUE:7 1st Qu.:2125
## Not displayed :1 Median :2730
## $75,000-99,999:1 Mean :3803
## Not employed :0 3rd Qu.:5277
## $0 :0 Max. :7083
## (Other) :0
## LoanKey TotalProsperLoans TotalProsperPaymentsBilled
## 3BD634237789212566E5A34:1 Min. :1 Min. :12
## 47A7336630616348408D321:1 1st Qu.:1 1st Qu.:12
## 5CC5343201507741750BCD4:1 Median :1 Median :12
## 5F1835683805778648B4FA5:1 Mean :1 Mean :12
## 6BC834233584328164F4A6B:1 3rd Qu.:1 3rd Qu.:12
## 70EA33942910132621878E3:1 Max. :1 Max. :12
## (Other) :1 NA's :6 NA's :6
## OnTimeProsperPayments ProsperPaymentsLessThanOneMonthLate
## Min. :12 Min. :0
## 1st Qu.:12 1st Qu.:0
## Median :12 Median :0
## Mean :12 Mean :0
## 3rd Qu.:12 3rd Qu.:0
## Max. :12 Max. :0
## NA's :6 NA's :6
## ProsperPaymentsOneMonthPlusLate ProsperPrincipalBorrowed
## Min. :0 Min. :2000
## 1st Qu.:0 1st Qu.:2000
## Median :0 Median :2000
## Mean :0 Mean :2000
## 3rd Qu.:0 3rd Qu.:2000
## Max. :0 Max. :2000
## NA's :6 NA's :6
## ProsperPrincipalOutstanding ScorexChangeAtTimeOfListing
## Min. :1546 Min. :1
## 1st Qu.:1546 1st Qu.:1
## Median :1546 Median :1
## Mean :1546 Mean :1
## 3rd Qu.:1546 3rd Qu.:1
## Max. :1546 Max. :1
## NA's :6 NA's :6
## LoanCurrentDaysDelinquent LoanFirstDefaultedCycleNumber
## Min. : 0.0 Min. : 1.0
## 1st Qu.: 132.0 1st Qu.: 6.0
## Median : 271.0 Median :14.0
## Mean : 565.4 Mean :16.4
## 3rd Qu.:1017.0 3rd Qu.:21.0
## Max. :1389.0 Max. :40.0
## NA's :2
## LoanMonthsSinceOrigination LoanNumber LoanOriginalAmount
## Min. :53.00 Min. : 2265 Min. :1000
## 1st Qu.:66.50 1st Qu.:25008 1st Qu.:1000
## Median :69.00 Median :33200 Median :1000
## Mean :70.71 Mean :28076 Mean :1007
## 3rd Qu.:74.50 3rd Qu.:36071 3rd Qu.:1000
## Max. :91.00 Max. :38907 Max. :1050
##
## LoanOriginationDate LoanOriginationQuarter MemberKey
## Min. :2006-08-02 Q2 2008:2 07553392568434371B69BEA:1
## 1st Qu.:2008-01-02 Q3 2008:2 0E6E3429920780746133A4B:1
## Median :2008-06-23 Q3 2006:1 2F503420860619469E1B2B2:1
## Mean :2008-04-24 Q3 2007:1 6B093422578451456F5E359:1
## 3rd Qu.:2008-09-01 Q4 2009:1 90713365131318276755B6C:1
## Max. :2009-10-15 Q4 2005:0 DB59342339253365288B87F:1
## (Other):0 (Other) :1
## MonthlyLoanPayment LP_CustomerPayments LP_CustomerPrincipalPayments
## Min. : 0.00 Min. : 41.91 Min. : 0.0
## 1st Qu.: 0.00 1st Qu.: 476.66 1st Qu.: 180.3
## Median : 0.00 Median :1794.40 Median : 712.8
## Mean :16.34 Mean :1331.67 Mean : 547.6
## 3rd Qu.:34.58 3rd Qu.:2049.74 3rd Qu.: 879.8
## Max. :45.24 Max. :2432.58 Max. :1000.0
##
## LP_InterestandFees LP_ServiceFees LP_CollectionFees
## Min. : 24.63 Min. :-26.710 Min. :-161.500
## 1st Qu.: 305.04 1st Qu.:-24.400 1st Qu.: -70.325
## Median :1034.88 Median : -9.530 Median : -45.160
## Mean : 784.12 Mean :-14.334 Mean : -51.132
## 3rd Qu.:1193.34 3rd Qu.: -7.225 3rd Qu.: -5.306
## Max. :1432.58 Max. : -0.850 Max. : 0.000
##
## LP_GrossPrincipalLoss LP_NetPrincipalLoss LP_NonPrincipalRecoverypayments
## Min. : 0.0 Min. :-33.660 Min. : 0.00
## 1st Qu.: 137.1 1st Qu.: 0.000 1st Qu.: 13.04
## Median : 337.2 Median : 0.000 Median : 29.79
## Mean : 464.4 Mean : -3.986 Mean : 571.45
## 3rd Qu.: 819.7 3rd Qu.: 0.000 3rd Qu.:1119.15
## Max. :1000.0 Max. : 5.760 Max. :1705.98
##
## PercentFunded Recommendations InvestmentFromFriendsCount
## Min. :1 Min. :0 Min. :0
## 1st Qu.:1 1st Qu.:0 1st Qu.:0
## Median :1 Median :0 Median :0
## Mean :1 Mean :0 Mean :0
## 3rd Qu.:1 3rd Qu.:0 3rd Qu.:0
## Max. :1 Max. :0 Max. :0
##
## InvestmentFromFriendsAmount Investors Rating PercentYield
## Min. :0 Min. : 1.00 HR :5 Min. :1.004
## 1st Qu.:0 1st Qu.: 6.50 E :2 1st Qu.:1.116
## Median :0 Median :13.00 NC :0 Median :1.189
## Mean :0 Mean :11.71 D :0 Mean :1.287
## 3rd Qu.:0 3rd Qu.:15.50 C :0 3rd Qu.:1.426
## Max. :0 Max. :24.00 B :0 Max. :1.730
## (Other):0
## Completed Completed.num
## Completed :2 Min. :0.0000
## Not Completed:5 1st Qu.:0.0000
## Median :0.0000
## Mean :0.2857
## 3rd Qu.:0.5000
## Max. :1.0000
##
# extreme losses
summary(filter(lender_data, PercentYield < -1))
## ListingKey ListingNumber ListingCreationDate
## 34CD3587284005601C13ED3:1 Min. :568875 Min. :2012-03-14
## 4CBB3541533232375D4CEA5:1 1st Qu.:655721 1st Qu.:2012-10-01
## 97A13588293719997E19C94:1 Median :781791 Median :2013-04-16
## C0D935652403806997CFF5E:1 Mean :753201 Mean :2013-02-08
## 00003546482094282EF90E5:0 3rd Qu.:879270 3rd Qu.:2013-08-24
## 00013542762124763F20254:0 Max. :880346 Max. :2013-08-26
## (Other) :0
## CreditGrade Term LoanStatus
## NC :0 Min. :36 Chargedoff :4
## HR :0 1st Qu.:36 Cancelled :0
## E :0 Median :36 Completed :0
## D :0 Mean :42 Current :0
## C :0 3rd Qu.:42 Defaulted :0
## (Other):0 Max. :60 FinalPaymentInProgress:0
## NA's :4 (Other) :0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2012-08-23 Min. :0.1370 Min. :0.1089 Min. :0.0989
## 1st Qu.:2013-03-19 1st Qu.:0.2044 1st Qu.:0.1790 1st Qu.:0.1690
## Median :2013-09-27 Median :0.2762 Median :0.2442 Median :0.2341
## Mean :2013-07-20 Mean :0.2618 Mean :0.2287 Mean :0.2187
## 3rd Qu.:2014-01-28 3rd Qu.:0.3335 3rd Qu.:0.2939 3rd Qu.:0.2838
## Max. :2014-01-28 Max. :0.3580 Max. :0.3177 Max. :0.3077
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :0.0956 Min. :0.0299 Min. :0.0657
## 1st Qu.:0.1589 1st Qu.:0.0599 1st Qu.:0.0990
## Median :0.2156 Median :0.1037 Median :0.1119
## Mean :0.2041 Mean :0.1006 Mean :0.1035
## 3rd Qu.:0.2608 3rd Qu.:0.1444 3rd Qu.:0.1164
## Max. :0.2896 Max. :0.1650 Max. :0.1246
##
## ProsperRating.num ProsperRating.alpha ProsperScore ListingCategory.num
## Min. :1.00 HR :1 Min. :3.00 1 :2
## 1st Qu.:1.75 E :1 1st Qu.:3.00 7 :1
## Median :3.00 C :1 Median :4.00 18 :1
## Mean :3.25 A :1 Mean :4.75 0 :0
## 3rd Qu.:4.50 NC :0 3rd Qu.:5.75 2 :0
## Max. :6.00 D :0 Max. :8.00 3 :0
## (Other):0 (Other):0
## BorrowerState Occupation EmploymentStatus
## CA :1 Administrative Assistant:1 Employed :4
## PA :1 Computer Programmer :1 Full-time :0
## VA :1 Executive :1 Not available:0
## WA :1 Truck Driver :1 Not employed :0
## AK :0 Accountant/CPA :0 Other :0
## AL :0 Analyst :0 Part-time :0
## (Other):0 (Other) :0 (Other) :0
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 2.00 Mode :logical Mode :logical
## 1st Qu.: 26.75 FALSE:1 FALSE:4
## Median : 39.00 TRUE :3
## Mean : 60.25
## 3rd Qu.: 72.50
## Max. :161.00
##
## GroupKey DateCreditPulled CreditScoreRangeLower
## 00343376901312423168731:0 Min. :2012-03-14 Min. :660
## 00943382969547936B0C529:0 1st Qu.:2012-10-01 1st Qu.:660
## 00AE3392027644405556335:0 Median :2013-04-16 Median :710
## 016833805323396548B2370:0 Mean :2013-02-08 Mean :715
## 01A133661136027706728BE:0 3rd Qu.:2013-08-24 3rd Qu.:765
## (Other) :0 Max. :2013-08-26 Max. :780
## NA's :4
## CreditScoreRangeUpper FirstRecordedCreditLine CurrentCreditLines
## Min. :679 Min. :1980-05-31 Min. : 0.00
## 1st Qu.:679 1st Qu.:1985-10-09 1st Qu.: 4.50
## Median :729 Median :1991-08-23 Median :10.00
## Mean :734 Mean :1992-03-14 Mean : 8.75
## 3rd Qu.:784 3rd Qu.:1998-01-26 3rd Qu.:14.25
## Max. :799 Max. :2005-02-10 Max. :15.00
##
## OpenCreditLines TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : 0.00 Min. :12.00 Min. : 0.00
## 1st Qu.: 4.50 1st Qu.:14.25 1st Qu.: 0.75
## Median : 9.50 Median :19.00 Median : 5.50
## Mean : 8.25 Mean :20.50 Mean : 5.50
## 3rd Qu.:13.25 3rd Qu.:25.25 3rd Qu.:10.25
## Max. :14.00 Max. :32.00 Max. :11.00
##
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. : 0.00 Min. :0.00 Min. :1.00
## 1st Qu.: 18.75 1st Qu.:0.75 1st Qu.:1.00
## Median : 716.50 Median :1.00 Median :1.50
## Mean : 949.00 Mean :1.00 Mean :2.75
## 3rd Qu.:1646.75 3rd Qu.:1.25 3rd Qu.:3.25
## Max. :2363.00 Max. :2.00 Max. :7.00
##
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. :0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.:0.00 1st Qu.: 0.00 1st Qu.: 0.00
## Median :0.00 Median : 0.00 Median : 2.50
## Mean :0.25 Mean : 51.75 Mean : 8.25
## 3rd Qu.:0.25 3rd Qu.: 51.75 3rd Qu.:10.75
## Max. :1.00 Max. :207.00 Max. :28.00
##
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. :0.00 Min. :0 Min. : 0.0
## 1st Qu.:0.00 1st Qu.:0 1st Qu.: 188.2
## Median :0.00 Median :0 Median :10729.0
## Mean :0.25 Mean :0 Mean :28864.5
## 3rd Qu.:0.25 3rd Qu.:0 3rd Qu.:39405.2
## Max. :1.00 Max. :0 Max. :94000.0
##
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. :0.0000 Min. : 0.00 Min. :11.00
## 1st Qu.:0.3825 1st Qu.: 36.75 1st Qu.:11.75
## Median :0.6700 Median : 262.00 Median :17.00
## Mean :0.5850 Mean : 5327.50 Mean :18.25
## 3rd Qu.:0.8725 3rd Qu.: 5552.75 3rd Qu.:23.50
## Max. :1.0000 Max. :20786.00 Max. :28.00
##
## TradesNeverDelinquent.per TradesOpenedLast6Months DebtToIncomeRatio
## Min. :0.380 Min. :0 Min. :0.0400
## 1st Qu.:0.635 1st Qu.:0 1st Qu.:0.1000
## Median :0.840 Median :0 Median :0.1500
## Mean :0.765 Mean :0 Mean :0.1325
## 3rd Qu.:0.970 3rd Qu.:0 3rd Qu.:0.1825
## Max. :1.000 Max. :0 Max. :0.1900
##
## IncomeRange IncomeVerifiable StatedMonthlyIncome
## $25,000-49,999:2 Mode:logical Min. : 2333
## $100,000+ :2 TRUE:4 1st Qu.: 2833
## Not displayed :0 Median : 9000
## Not employed :0 Mean :10188
## $0 :0 3rd Qu.:16354
## $1-24,999 :0 Max. :20417
## (Other) :0
## LoanKey TotalProsperLoans TotalProsperPaymentsBilled
## 1724369081908936857F8EF:1 Min. : NA Min. : NA
## 74FF36470534073180AA9D0:1 1st Qu.: NA 1st Qu.: NA
## 9D24367018687929200F8B0:1 Median : NA Median : NA
## E8EC36918514705070CF2EE:1 Mean :NaN Mean :NaN
## 00003683605746079487FF7:0 3rd Qu.: NA 3rd Qu.: NA
## 00013421083473792D70F75:0 Max. : NA Max. : NA
## (Other) :0 NA's :4 NA's :4
## OnTimeProsperPayments ProsperPaymentsLessThanOneMonthLate
## Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA
## Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA
## NA's :4 NA's :4
## ProsperPaymentsOneMonthPlusLate ProsperPrincipalBorrowed
## Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA
## Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA
## NA's :4 NA's :4
## ProsperPrincipalOutstanding ScorexChangeAtTimeOfListing
## Min. : NA Min. : NA
## 1st Qu.: NA 1st Qu.: NA
## Median : NA Median : NA
## Mean :NaN Mean :NaN
## 3rd Qu.: NA 3rd Qu.: NA
## Max. : NA Max. : NA
## NA's :4 NA's :4
## LoanCurrentDaysDelinquent LoanFirstDefaultedCycleNumber
## Min. :163.0 Min. :5.00
## 1st Qu.:163.0 1st Qu.:5.00
## Median :285.5 Median :5.00
## Mean :355.0 Mean :5.25
## 3rd Qu.:477.5 3rd Qu.:5.25
## Max. :686.0 Max. :6.00
##
## LoanMonthsSinceOrigination LoanNumber LoanOriginalAmount
## Min. : 7.00 Min. : 62368 Min. : 3000
## 1st Qu.: 7.00 1st Qu.: 75132 1st Qu.: 3000
## Median :11.00 Median : 89780 Median :10439
## Mean :13.25 Mean : 85586 Mean :12220
## 3rd Qu.:17.25 3rd Qu.:100235 3rd Qu.:19658
## Max. :24.00 Max. :100414 Max. :25000
##
## LoanOriginationDate LoanOriginationQuarter MemberKey
## Min. :2012-03-23 Q3 2013:2 0CE43587499687303B53D9D:1
## 1st Qu.:2012-10-17 Q1 2012:1 39A5356469782230058076F:1
## Median :2013-04-27 Q4 2012:1 49053588028781958EA2262:1
## Mean :2013-02-17 Q4 2005:0 6AB43541952481860D5AA31:1
## 3rd Qu.:2013-08-28 Q1 2006:0 00003397697413387CAF966:0
## Max. :2013-08-28 Q2 2006:0 000035297015484885C64F8:0
## (Other):0 (Other) :0
## MonthlyLoanPayment LP_CustomerPayments LP_CustomerPrincipalPayments
## Min. :125.0 Min. :-2.3499 Min. :0
## 1st Qu.:129.0 1st Qu.:-0.5875 1st Qu.:0
## Median :355.0 Median : 0.0000 Median :0
## Mean :375.2 Mean :-0.5875 Mean :0
## 3rd Qu.:601.2 3rd Qu.: 0.0000 3rd Qu.:0
## Max. :665.7 Max. : 0.0000 Max. :0
##
## LP_InterestandFees LP_ServiceFees LP_CollectionFees LP_GrossPrincipalLoss
## Min. :-2.3499 Min. :0 Min. :-2.360 Min. : 3000
## 1st Qu.:-0.5875 1st Qu.:0 1st Qu.:-1.865 1st Qu.: 3000
## Median : 0.0000 Median :0 Median :-1.050 Median :10439
## Mean :-0.5875 Mean :0 Mean :-1.115 Mean :12220
## 3rd Qu.: 0.0000 3rd Qu.:0 3rd Qu.:-0.300 3rd Qu.:19659
## Max. : 0.0000 Max. :0 Max. : 0.000 Max. :25000
##
## LP_NetPrincipalLoss LP_NonPrincipalRecoverypayments PercentFunded
## Min. : 3000 Min. :0 Min. :0.8721
## 1st Qu.: 3000 1st Qu.:0 1st Qu.:0.9680
## Median :10439 Median :0 Median :1.0000
## Mean :12220 Mean :0 Mean :0.9680
## 3rd Qu.:19659 3rd Qu.:0 3rd Qu.:1.0000
## Max. :25000 Max. :0 Max. :1.0000
##
## Recommendations InvestmentFromFriendsCount InvestmentFromFriendsAmount
## Min. :0 Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0
## Median :0 Median :0 Median :0
## Mean :0 Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0 Max. :0
##
## Investors Rating PercentYield Completed
## Min. : 1.00 HR :1 Min. :-1.001 Completed :0
## 1st Qu.: 5.50 E :1 1st Qu.:-1.001 Not Completed:4
## Median : 38.00 C :1 Median :-1.000
## Mean : 65.75 A :1 Mean :-1.000
## 3rd Qu.: 98.25 NC :0 3rd Qu.:-1.000
## Max. :186.00 D :0 Max. :-1.000
## (Other):0
## Completed.num
## Min. :0
## 1st Qu.:0
## Median :0
## Mean :0
## 3rd Qu.:0
## Max. :0
##
For the PercentYield measure, extreme gains seem to be loans that were defaulted on or charged off but then fully, or almost fully, recovered, with large interest payments on top.
Extreme losses come from loans that were charged off without any payment ever being made or recovered, and where investors additionally lost money on collection fees. Overall, the data looks sensible.
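The reading above rests on only a handful of columns, so a narrower view can be easier to scan than a full `summary()`. The following is a minimal sketch on invented toy rows, not on `lender_data` itself. The column names are taken from the summaries above, and the `> 1` cut-off for gains is an assumption based on the range shown in the gains summary.

```r
library(dplyr)

# Toy rows with invented values; only the column names come from the data set.
toy <- tibble(
  LoanStatus            = factor(c("Chargedoff", "Defaulted", "Completed", "Chargedoff")),
  LP_CustomerPayments   = c(0, 2400, 1150, 300),
  LP_CollectionFees     = c(-2.4, -160, 0, 0),
  LP_GrossPrincipalLoss = c(3000, 1000, 0, 2800),
  PercentYield          = c(-1.001, 1.4, 0.15, -0.9)
)

# Keep only the extreme rows and the columns that explain them,
# ordered from the largest loss to the largest gain.
extremes <- toy %>%
  filter(PercentYield < -1 | PercentYield > 1) %>%
  select(LoanStatus, LP_CustomerPayments, LP_CollectionFees,
         LP_GrossPrincipalLoss, PercentYield) %>%
  arrange(PercentYield)

print(extremes)
```

Swapping `toy` for `lender_data` should give the same view on the real loans, assuming the columns are named as in the summaries.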
Now I want to take a closer look at the relationship between this measure and the other primary measure of interest: LoanStatus.
LoanStatus by PercentYield
table_stats(lender_data, "LoanStatus", "PercentYield")
Here it can be seen that cancelled loans do not result in any money gained or lost by the lender. For completed loans, lenders earn on average around 20% on top of their initial investment; for charged-off or defaulted loans, they lose about 50% of their original investment on average. Lenders may therefore be taking a substantial risk by investing, particularly in loans that are likely to be defaulted on or charged off. As the confidence intervals in the table above show, this trend is distinct and stable.
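For readers who want to reproduce this kind of per-status table, here is a minimal sketch of a group-wise mean with a t-based 95% confidence interval. The helper `group_ci()` is hypothetical and is not the `table_stats()` function used above, whose definition is not shown in this section. The toy values are invented.

```r
library(dplyr)

# Toy stand-in for lender_data; the values are invented for illustration only.
toy <- tibble(
  LoanStatus   = factor(c("Completed", "Completed", "Completed",
                          "Chargedoff", "Chargedoff", "Defaulted", "Defaulted")),
  PercentYield = c(0.15, 0.22, 0.25, -0.60, -0.40, -0.55, -0.45)
)

# Hypothetical helper: per-group n, mean, and a t-based confidence interval.
# It needs at least two non-missing values per group.
group_ci <- function(df, group, value, level = 0.95) {
  df %>%
    group_by(across(all_of(group))) %>%
    summarise(
      n   = sum(!is.na(.data[[value]])),
      avg = mean(.data[[value]], na.rm = TRUE),
      se  = sd(.data[[value]], na.rm = TRUE) / sqrt(n),
      .groups = "drop"
    ) %>%
    mutate(
      ci_low  = avg - qt(1 - (1 - level) / 2, df = n - 1) * se,
      ci_high = avg + qt(1 - (1 - level) / 2, df = n - 1) * se
    )
}

stats <- group_ci(toy, "LoanStatus", "PercentYield")
print(stats)
```

Intervals that do not overlap between groups are what "distinct and stable" would look like in such a table; with only a few rows per group, as in the toy data, the intervals are naturally much wider than in the real data set.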
Since only 5 loans in the relevant data set were ever cancelled, I will exclude this category from further analysis.
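Excluding a sparse category can be sketched as follows, again on invented toy rows. Dropping the emptied factor level is an optional extra step assumed here; it keeps the level from showing up with a count of zero in later tables.

```r
library(dplyr)

# Toy stand-in for lender_data; the values are invented for illustration only.
toy <- tibble(
  LoanStatus   = factor(c("Cancelled", "Completed", "Chargedoff",
                          "Defaulted", "Completed")),
  PercentYield = c(0, 0.2, -0.5, -0.6, 0.25)
)

# Remove the sparse category, then drop the now-empty factor level.
# Note that filter() also removes rows where LoanStatus is NA.
toy_kept <- toy %>%
  filter(LoanStatus != "Cancelled") %>%
  mutate(LoanStatus = droplevels(LoanStatus))

print(table(toy_kept$LoanStatus))
```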
The interquartile range and average values of PercentYield are roughly equal between loans that have been charged off and those that have been defaulted on. This suggests that the relevant measure might simply be whether the loan was completed or not (although ultimately, loans defaulted on may still be repaid at some point).
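The comparison above can be checked side by side, and the two statuses can be folded into a single "Not Completed" level. The sketch below uses invented toy values. The data set already carries a `Completed` column, but how it was derived is not shown in this section, so the recoding here is an assumption.

```r
library(dplyr)

# Toy stand-in for lender_data; the values are invented for illustration only.
toy <- tibble(
  LoanStatus   = factor(c("Chargedoff", "Chargedoff", "Chargedoff",
                          "Defaulted", "Defaulted", "Defaulted",
                          "Completed", "Completed", "Completed")),
  PercentYield = c(-0.8, -0.6, -0.3, -0.8, -0.55, -0.3, 0.1, 0.2, 0.3)
)

# Mean, quartiles, and interquartile range for the two non-completed statuses.
spread <- toy %>%
  filter(LoanStatus %in% c("Chargedoff", "Defaulted")) %>%
  group_by(LoanStatus) %>%
  summarise(
    avg = mean(PercentYield),
    q1  = quantile(PercentYield, 0.25, names = FALSE),
    q3  = quantile(PercentYield, 0.75, names = FALSE),
    iqr = IQR(PercentYield),
    .groups = "drop"
  )
print(spread)

# Assumed recoding: everything that is not "Completed" becomes "Not Completed".
toy <- toy %>%
  mutate(Completed = factor(if_else(LoanStatus == "Completed",
                                    "Completed", "Not Completed")))
print(table(toy$Completed))
```

If the two rows of `spread` are close on every column, treating both statuses as one level loses little information about PercentYield.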
# look more closely at loans charged off
summary(filter(lender_data, LoanStatus == "Chargedoff"))
## ListingKey ListingNumber ListingCreationDate
## 00013542762124763F20254: 1 Min. : 156 Min. :2006-02-11
## 000433785890431972B4743: 1 1st Qu.:188514 1st Qu.:2007-08-16
## 0005353671687550573289D: 1 Median :369917 Median :2008-07-19
## 001035373445372274F74E2: 1 Mean :364865 Mean :2009-07-14
## 00143395229257559A91663: 1 3rd Qu.:534036 3rd Qu.:2011-10-19
## 00153399719267548BE59C1: 1 Max. :932346 Max. :2013-09-26
## (Other) :11986
## CreditGrade Term LoanStatus
## D :1343 Min. :12.00 Chargedoff :11992
## C :1310 1st Qu.:36.00 Cancelled : 0
## HR :1242 Median :36.00 Completed : 0
## E : 946 Mean :38.03 Current : 0
## B : 909 3rd Qu.:36.00 Defaulted : 0
## (Other): 900 Max. :60.00 FinalPaymentInProgress: 0
## NA's :5342 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2006-11-23 Min. :0.01823 Min. :0.0100 Min. :0.0000
## 1st Qu.:2009-03-11 1st Qu.:0.19003 1st Qu.:0.1769 1st Qu.:0.1650
## Median :2010-08-17 Median :0.26271 Median :0.2400 Median :0.2300
## Mean :2011-01-07 Mean :0.25775 Mean :0.2354 Mean :0.2247
## 3rd Qu.:2013-01-29 3rd Qu.:0.32958 3rd Qu.:0.2975 3rd Qu.:0.2869
## Max. :2014-03-10 Max. :0.46201 Max. :0.4500 Max. :0.4325
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.182 Min. :0.006 Min. :-0.182
## 1st Qu.: 0.164 1st Qu.:0.087 1st Qu.: 0.111
## Median : 0.236 Median :0.112 Median : 0.125
## Mean : 0.218 Mean :0.116 Mean : 0.123
## 3rd Qu.: 0.286 3rd Qu.:0.149 3rd Qu.: 0.144
## Max. : 0.320 Max. :0.366 Max. : 0.284
## NA's :6656 NA's :6656 NA's :6656
## ProsperRating.num ProsperRating.alpha ProsperScore
## Min. :1.000 D :1395 Min. : 1.000
## 1st Qu.:2.000 HR :1215 1st Qu.: 4.000
## Median :3.000 E :1131 Median : 5.000
## Mean :2.883 C : 706 Mean : 5.391
## 3rd Qu.:4.000 B : 500 3rd Qu.: 7.000
## Max. :7.000 (Other): 389 Max. :10.000
## NA's :6656 NA's :6656 NA's :6656
## ListingCategory.num BorrowerState Occupation
## 0 :3792 CA :1574 Other :3459
## 1 :3655 FL : 761 Professional :1229
## 7 :1225 GA : 688 Sales - Commission : 485
## 3 :1108 IL : 685 Administrative Assistant: 460
## 2 : 669 TX : 579 Clerical : 444
## 4 : 577 (Other):6665 (Other) :5624
## (Other): 966 NA's :1040 NA's : 291
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Full-time :5343 Min. : 0.00 Mode :logical
## Employed :3529 1st Qu.: 19.00 FALSE:6661
## Not available:1065 Median : 51.00 TRUE :5331
## Self-employed: 897 Mean : 80.18
## Other : 295 3rd Qu.:111.00
## (Other) : 572 Max. :755.00
## NA's : 291 NA's :1358
## CurrentlyInGroup GroupKey DateCreditPulled
## Mode :logical 783C3371218786870A73D20: 291 Min. :2006-01-29
## FALSE:9187 FEF83377364176536637E50: 235 1st Qu.:2007-08-07
## TRUE :2805 3D4D3366260257624AB272D: 139 Median :2008-07-11
## 6A3B336601725506917317E: 120 Mean :2009-07-10
## 9BBE337094173775621CD34: 119 3rd Qu.:2011-10-19
## (Other) :1968 Max. :2013-09-26
## NA's :9120
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1955-05-01
## 1st Qu.:600.0 1st Qu.:619.0 1st Qu.:1990-07-01
## Median :660.0 Median :679.0 Median :1995-09-06
## Mean :648.9 Mean :667.9 Mean :1994-10-22
## 3rd Qu.:700.0 3rd Qu.:719.0 3rd Qu.:2000-02-01
## Max. :860.0 Max. :879.0 Max. :2011-08-10
## NA's :48 NA's :48 NA's :76
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.000 Min. : 0.000 Min. : 2.00
## 1st Qu.: 5.000 1st Qu.: 4.000 1st Qu.: 14.00
## Median : 8.000 Median : 7.000 Median : 22.00
## Mean : 8.846 Mean : 7.728 Mean : 24.58
## 3rd Qu.:12.000 3rd Qu.:10.000 3rd Qu.: 33.00
## Max. :48.000 Max. :43.000 Max. :129.00
## NA's :1356 NA's :1356 NA's :76
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 2.000 1st Qu.: 42.0 1st Qu.: 0.000
## Median : 5.000 Median : 152.0 Median : 2.000
## Mean : 5.575 Mean : 308.5 Mean : 2.767
## 3rd Qu.: 8.000 3rd Qu.: 383.2 3rd Qu.: 4.000
## Max. :41.000 Max. :14985.0 Max. :105.000
## NA's :76
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.000 Min. : 0
## 1st Qu.: 3.000 1st Qu.: 0.000 1st Qu.: 0
## Median : 6.000 Median : 0.000 Median : 0
## Mean : 8.916 Mean : 1.384 Mean : 1493
## 3rd Qu.: 12.000 3rd Qu.: 1.000 3rd Qu.: 130
## Max. :379.000 Max. :64.000 Max. :444745
## NA's :126 NA's :76 NA's :1358
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 6.022 Mean : 0.4553
## 3rd Qu.: 7.000 3rd Qu.: 1.0000
## Max. :99.000 Max. :30.0000
## NA's :120 NA's :76
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. :0.0000 Min. : 0.0 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.: 892.8 1st Qu.:0.2100
## Median :0.0000 Median : 4440.5 Median :0.6000
## Mean :0.0351 Mean : 14005.8 Mean :0.5516
## 3rd Qu.:0.0000 3rd Qu.: 13882.0 3rd Qu.:0.8800
## Max. :7.0000 Max. :600223.0 Max. :2.6800
## NA's :1356 NA's :1356 NA's :1356
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent.per
## Min. : 0 Min. : 1.00 Min. :0.000
## 1st Qu.: 175 1st Qu.: 11.00 1st Qu.:0.710
## Median : 1587 Median : 18.00 Median :0.870
## Mean : 6897 Mean : 20.44 Mean :0.815
## 3rd Qu.: 7196 3rd Qu.: 28.00 3rd Qu.:1.000
## Max. :364284 Max. :118.00 Max. :1.000
## NA's :1342 NA's :1342 NA's :1342
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## Min. : 0.000 Min. : 0.0000 $25,000-49,999:4162
## 1st Qu.: 0.000 1st Qu.: 0.1300 $50,000-74,999:2633
## Median : 1.000 Median : 0.2100 Not displayed :1380
## Mean : 1.078 Mean : 0.3392 $1-24,999 :1329
## 3rd Qu.: 2.000 3rd Qu.: 0.3300 $75,000-99,999:1153
## Max. :17.000 Max. :10.0100 $100,000+ : 968
## NA's :1342 NA's :1241 (Other) : 367
## IncomeVerifiable StatedMonthlyIncome LoanKey
## Mode :logical Min. : 0 00023650503696810C531F7: 1
## FALSE:1260 1st Qu.: 2500 0004363753221955965B646: 1
## TRUE :10732 Median : 3750 000836579711360490B130B: 1
## Mean : 4486 000B3656359179267F91999: 1
## 3rd Qu.: 5500 001C336540093530548F61A: 1
## Max. :208333 001E3652350675777DB09A9: 1
## (Other) :11986
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. :1.000 Min. : 0.00 Min. : 0.00
## 1st Qu.:1.000 1st Qu.: 9.00 1st Qu.: 8.00
## Median :1.000 Median : 12.00 Median : 12.00
## Mean :1.253 Mean : 18.86 Mean : 18.03
## 3rd Qu.:1.000 3rd Qu.: 25.00 3rd Qu.: 24.00
## Max. :7.000 Max. :103.00 Max. :101.00
## NA's :10015 NA's :10015 NA's :10015
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.000 Min. :0.00
## 1st Qu.: 0.000 1st Qu.:0.00
## Median : 0.000 Median :0.00
## Mean : 0.775 Mean :0.06
## 3rd Qu.: 0.000 3rd Qu.:0.00
## Max. :24.000 Max. :8.00
## NA's :10015 NA's :10015
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0.00
## 1st Qu.: 3000 1st Qu.: 4.33
## Median : 5000 Median : 2068.99
## Mean : 6653 Mean : 3054.05
## 3rd Qu.: 8250 3rd Qu.: 4161.71
## Max. :45000 Max. :21862.26
## NA's :10015 NA's :10015
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-194.00 Min. : 121.0
## 1st Qu.: -40.00 1st Qu.: 507.8
## Median : -8.00 Median :1389.0
## Mean : -10.56 Mean :1256.0
## 3rd Qu.: 18.00 3rd Qu.:1927.0
## Max. : 214.00 Max. :2704.0
## NA's :10016
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 1.00 Min. : 5.00 Min. : 59
## 1st Qu.:10.00 1st Qu.:29.00 1st Qu.: 18375
## Median :16.00 Median :68.00 Median : 34842
## Mean :17.06 Mean :55.72 Mean : 37703
## 3rd Qu.:23.00 3rd Qu.:79.00 3rd Qu.: 55294
## Max. :41.00 Max. :97.00 Max. :103467
## NA's :7
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2006-02-21 Q2 2008:1093
## 1st Qu.: 3000 1st Qu.:2007-08-28 Q2 2007: 840
## Median : 4500 Median :2008-07-31 Q3 2008: 829
## Mean : 6399 Mean :2009-07-26 Q3 2007: 732
## 3rd Qu.: 8000 3rd Qu.:2011-10-31 Q1 2008: 731
## Max. :25000 Max. :2013-10-01 Q1 2007: 723
## (Other):7044
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 006C3373804016872128132: 2 Min. : 29.97 Min. : -2.35
## 009C35078002646985845CF: 2 1st Qu.: 112.24 1st Qu.: 789.91
## 00C43387968070538859D91: 2 Median : 173.71 Median : 1788.68
## 01DC337523139473583CD4C: 2 Mean : 235.36 Mean : 2888.32
## 01F733654063535141541E2: 2 3rd Qu.: 308.21 3rd Qu.: 3666.66
## 02493426138410789969A20: 2 Max. :1552.76 Max. :29825.73
## (Other) :11980
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : -2.35 Min. :-664.87
## 1st Qu.: 352.1 1st Qu.: 376.45 1st Qu.: -61.53
## Median : 914.4 Median : 793.99 Median : -29.15
## Mean : 1731.2 Mean : 1157.11 Mean : -46.86
## 3rd Qu.: 2093.6 3rd Qu.: 1520.05 3rd Qu.: -12.83
## Max. :24074.3 Max. :14329.49 Max. : 0.00
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9274.75 Min. : 0 Min. : -69.19
## 1st Qu.: -18.75 1st Qu.: 1817 1st Qu.: 1765.05
## Median : 0.00 Median : 3345 Median : 3301.16
## Mean : -49.84 Mean : 4663 Mean : 4608.30
## 3rd Qu.: 0.00 3rd Qu.: 5956 3rd Qu.: 5887.32
## Max. : 0.00 Max. :25000 Max. :25000.00
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.0 Min. :0.7012 Min. : 0.00000
## 1st Qu.: 0.0 1st Qu.:1.0000 1st Qu.: 0.00000
## Median : 0.0 Median :1.0000 Median : 0.00000
## Mean : 136.9 Mean :0.9978 Mean : 0.08547
## 3rd Qu.: 0.0 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :21117.9 Max. :1.0000 Max. :16.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.00000 Min. : 0.00 Min. : 1.00
## 1st Qu.:0.00000 1st Qu.: 0.00 1st Qu.: 28.00
## Median :0.00000 Median : 0.00 Median : 61.00
## Mean :0.03619 Mean : 32.14 Mean : 96.11
## 3rd Qu.:0.00000 3rd Qu.: 0.00 3rd Qu.:127.00
## Max. :9.00000 Max. :12500.00 Max. :870.00
##
## Rating PercentYield Completed Completed.num
## D :2738 Min. :-1.0009 Completed : 0 Min. :0
## HR :2457 1st Qu.:-0.7870 Not Completed:11992 1st Qu.:0
## E :2077 Median :-0.5762 Median :0
## C :2016 Mean :-0.5095 Mean :0
## B :1409 3rd Qu.:-0.2864 3rd Qu.:0
## (Other):1289 Max. : 1.5901 Max. :0
## NA's : 6
# look more closely at loans defaulted on
summary(filter(lender_data, LoanStatus == "Defaulted"))
## ListingKey ListingNumber ListingCreationDate
## 00003546482094282EF90E5: 1 Min. : 99 Min. :2006-01-25
## 001C3375545731729D10129: 1 1st Qu.: 69064 1st Qu.:2006-12-01
## 001D33654297803968707DD: 1 Median : 178389 Median :2007-07-29
## 00293413955892317967503: 1 Mean : 232428 Mean :2008-03-21
## 005B3378937131619860EC9: 1 3rd Qu.: 367876 3rd Qu.:2008-07-15
## 00773373220677521177C26: 1 Max. :1099553 Max. :2013-12-27
## (Other) :5012
## CreditGrade Term LoanStatus
## HR : 891 Min. :12.00 Defaulted :5018
## C : 729 1st Qu.:36.00 Cancelled : 0
## D : 684 Median :36.00 Chargedoff : 0
## E : 665 Mean :36.84 Completed : 0
## B : 493 3rd Qu.:36.00 Current : 0
## (Other): 548 Max. :60.00 FinalPaymentInProgress: 0
## NA's :1008 (Other) : 0
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2006-09-05 Min. :0.00864 Min. :0.0000 Min. :-0.0100
## 1st Qu.:2007-10-28 1st Qu.:0.17722 1st Qu.:0.1650 1st Qu.: 0.1549
## Median :2009-02-12 Median :0.24001 Median :0.2296 Median : 0.2150
## Mean :2009-07-10 Mean :0.23893 Mean :0.2231 Mean : 0.2121
## 3rd Qu.:2010-08-06 3rd Qu.:0.29776 3rd Qu.:0.2875 3rd Qu.: 0.2700
## Max. :2014-03-04 Max. :0.50633 Max. :0.4975 Max. : 0.4800
##
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.046 Min. :0.006 Min. :-0.046
## 1st Qu.: 0.146 1st Qu.:0.085 1st Qu.: 0.109
## Median : 0.233 Median :0.112 Median : 0.127
## Mean : 0.209 Mean :0.112 Mean : 0.123
## 3rd Qu.: 0.285 3rd Qu.:0.147 3rd Qu.: 0.144
## Max. : 0.320 Max. :0.366 Max. : 0.254
## NA's :4013 NA's :4013 NA's :4013
## ProsperRating.num ProsperRating.alpha ProsperScore ListingCategory.num
## Min. :1.000 D : 282 Min. : 1.00 0 :2903
## 1st Qu.:2.000 HR : 209 1st Qu.: 4.00 1 :1045
## Median :3.000 E : 193 Median : 6.00 3 : 301
## Mean :3.016 C : 134 Mean : 5.62 7 : 266
## 3rd Qu.:4.000 B : 88 3rd Qu.: 7.00 4 : 198
## Max. :7.000 (Other): 99 Max. :11.00 2 : 141
## NA's :4013 NA's :4013 NA's :4013 (Other): 164
## BorrowerState Occupation EmploymentStatus
## CA : 732 Other :1281 Full-time :2217
## GA : 343 Professional : 482 Not available:1204
## TX : 342 Clerical : 268 Employed : 630
## IL : 337 Administrative Assistant: 216 Self-employed: 246
## FL : 222 Sales - Commission : 205 Part-time : 60
## (Other):2453 (Other) :2044 (Other) : 139
## NA's : 589 NA's : 522 NA's : 522
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 0.00 Mode :logical Mode :logical
## 1st Qu.: 19.00 FALSE:2744 FALSE:3155
## Median : 52.00 TRUE :2274 TRUE :1863
## Mean : 79.48
## 3rd Qu.:112.50
## Max. :554.00
## NA's :1727
## GroupKey DateCreditPulled CreditScoreRangeLower
## 783C3371218786870A73D20: 208 Min. :2005-12-11 Min. : 0.0
## 6A3B336601725506917317E: 156 1st Qu.:2006-11-27 1st Qu.:560.0
## 3D4D3366260257624AB272D: 155 Median :2007-07-24 Median :640.0
## FE113364863511529673D04: 99 Mean :2008-03-17 Mean :620.9
## FEF83377364176536637E50: 79 3rd Qu.:2008-07-10 3rd Qu.:680.0
## (Other) :1341 Max. :2013-12-27 Max. :860.0
## NA's :2980 NA's :126
## CreditScoreRangeUpper FirstRecordedCreditLine CurrentCreditLines
## Min. : 19.0 Min. :1947-08-24 Min. : 0.00
## 1st Qu.:579.0 1st Qu.:1990-08-10 1st Qu.: 6.00
## Median :659.0 Median :1995-08-04 Median :10.00
## Mean :639.9 Mean :1994-07-20 Mean :10.64
## 3rd Qu.:699.0 3rd Qu.:1999-08-12 3rd Qu.:14.00
## Max. :879.0 Max. :2009-08-13 Max. :52.00
## NA's :126 NA's :157 NA's :1727
## OpenCreditLines TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : 0.000 Min. : 2.00 Min. : 0.000
## 1st Qu.: 5.000 1st Qu.: 16.00 1st Qu.: 2.000
## Median : 8.000 Median : 24.00 Median : 5.000
## Mean : 9.168 Mean : 26.12 Mean : 5.646
## 3rd Qu.:12.000 3rd Qu.: 35.00 3rd Qu.: 8.000
## Max. :51.000 Max. :101.00 Max. :51.000
## NA's :1727 NA's :157
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. : 0.0 Min. : 0.000 Min. : 0.00
## 1st Qu.: 25.0 1st Qu.: 1.000 1st Qu.: 4.00
## Median : 155.0 Median : 2.000 Median : 8.00
## Mean : 344.7 Mean : 3.538 Mean : 11.07
## 3rd Qu.: 446.8 3rd Qu.: 5.000 3rd Qu.: 15.00
## Max. :8001.0 Max. :53.000 Max. :158.00
## NA's :157 NA's :243
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. : 0.000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.0 1st Qu.: 0.000
## Median : 0.000 Median : 0.0 Median : 0.000
## Mean : 2.137 Mean : 1207.4 Mean : 6.025
## 3rd Qu.: 2.000 3rd Qu.: 24.5 3rd Qu.: 7.000
## Max. :83.000 Max. :183396.0 Max. :99.000
## NA's :157 NA's :1727 NA's :223
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. : 0.0000 Min. :0.0000 Min. : 0
## 1st Qu.: 0.0000 1st Qu.:0.0000 1st Qu.: 2120
## Median : 0.0000 Median :0.0000 Median : 8233
## Mean : 0.4203 Mean :0.0383 Mean : 20584
## 3rd Qu.: 1.0000 3rd Qu.:0.0000 3rd Qu.: 21938
## Max. :22.0000 Max. :4.0000 Max. :486503
## NA's :157 NA's :1727 NA's :1727
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. :0.0000 Min. : 0 Min. : 1.0
## 1st Qu.:0.3400 1st Qu.: 302 1st Qu.:14.0
## Median :0.6900 Median : 2436 Median :22.0
## Mean :0.6133 Mean : 7889 Mean :23.6
## 3rd Qu.:0.9000 3rd Qu.: 9018 3rd Qu.:31.0
## Max. :4.7300 Max. :289168 Max. :86.0
## NA's :1727 NA's :1724 NA's :1724
## TradesNeverDelinquent.per TradesOpenedLast6Months DebtToIncomeRatio
## Min. :0.0000 Min. : 0.000 Min. : 0.0000
## 1st Qu.:0.7300 1st Qu.: 0.000 1st Qu.: 0.1400
## Median :0.9000 Median : 1.000 Median : 0.2200
## Mean :0.8327 Mean : 1.256 Mean : 0.3693
## 3rd Qu.:1.0000 3rd Qu.: 2.000 3rd Qu.: 0.3500
## Max. :1.0000 Max. :11.000 Max. :10.0100
## NA's :1724 NA's :1724 NA's :255
## IncomeRange IncomeVerifiable StatedMonthlyIncome
## Not displayed :1747 Mode :logical Min. : 0
## $25,000-49,999:1290 FALSE:258 1st Qu.: 2500
## $50,000-74,999: 874 TRUE :4760 Median : 3708
## $75,000-99,999: 375 Mean : 4367
## $1-24,999 : 334 3rd Qu.: 5417
## $100,000+ : 322 Max. :58617
## (Other) : 76
## LoanKey TotalProsperLoans
## 000B3366346245964D6187E: 1 Min. :1.000
## 00193564075967640E1A9A1: 1 1st Qu.:1.000
## 00483379319461501511D07: 1 Median :1.000
## 004C3382466805517A0159B: 1 Mean :1.256
## 004E33659258749958952AB: 1 3rd Qu.:1.000
## 004E3381838563573A42B75: 1 Max. :6.000
## (Other) :5012 NA's :4541
## TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 7.00 1st Qu.: 7.00
## Median : 12.00 Median : 12.00
## Mean : 17.89 Mean : 17.07
## 3rd Qu.: 23.00 3rd Qu.: 21.00
## Max. :101.00 Max. :101.00
## NA's :4541 NA's :4541
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.000 Min. :0.000
## 1st Qu.: 0.000 1st Qu.:0.000
## Median : 0.000 Median :0.000
## Mean : 0.692 Mean :0.124
## 3rd Qu.: 0.000 3rd Qu.:0.000
## Max. :26.000 Max. :9.000
## NA's :4541 NA's :4541
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0
## 1st Qu.: 3000 1st Qu.: 0
## Median : 5000 Median : 1717
## Mean : 6709 Mean : 2933
## 3rd Qu.: 8000 3rd Qu.: 4147
## Max. :53200 Max. :22587
## NA's :4541 NA's :4541
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-160.000 Min. : 1.0
## 1st Qu.: -40.000 1st Qu.: 175.0
## Median : 0.000 Median : 249.0
## Mean : -6.937 Mean : 451.1
## 3rd Qu.: 21.500 3rd Qu.: 551.8
## Max. : 176.000 Max. :2421.0
## NA's :4543
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.0 Min. : 3.00 Min. : 29
## 1st Qu.: 8.0 1st Qu.:68.00 1st Qu.: 5451
## Median :12.0 Median :79.00 Median : 17895
## Mean :14.3 Mean :71.47 Mean : 23119
## 3rd Qu.:19.0 3rd Qu.:87.00 3rd Qu.: 34604
## Max. :44.0 Max. :98.00 Max. :124070
## NA's :105
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2006-01-27 Q1 2007: 568
## 1st Qu.: 2550 1st Qu.:2006-12-13 Q4 2006: 557
## Median : 4275 Median :2007-08-10 Q3 2006: 474
## Mean : 6487 Mean :2008-04-02 Q2 2007: 425
## 3rd Qu.: 8000 3rd Qu.:2008-07-25 Q2 2008: 384
## Max. :25000 Max. :2013-12-31 Q3 2008: 318
## (Other):2292
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 018B35275926204010E51B6: 2 Min. : 0.00 Min. : 0.0
## 01D33386346150055C7F757: 2 1st Qu.: 99.92 1st Qu.: 452.4
## 01DA3382241797159B9FE89: 2 Median : 167.62 Median : 1297.3
## 03863429108114327FA5713: 2 Mean : 233.49 Mean : 2607.5
## 03F43394048903402B10A91: 2 3rd Qu.: 303.10 3rd Qu.: 3300.1
## 06AC3396777271901E3E43F: 2 Max. :1102.78 Max. :34021.8
## (Other) :5006
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : 0.0 Min. :-425.880
## 1st Qu.: 206.1 1st Qu.: 217.5 1st Qu.: -52.258
## Median : 680.9 Median : 568.6 Median : -19.980
## Mean : 1648.5 Mean : 959.0 Mean : -39.681
## 3rd Qu.: 1930.1 3rd Qu.: 1252.5 3rd Qu.: -5.965
## Max. :24939.2 Max. :12242.0 Max. : 32.060
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-6221.3 Min. : -94.2 Min. : -954.5
## 1st Qu.: -16.5 1st Qu.: 1759.3 1st Qu.: 1364.7
## Median : 0.0 Median : 3151.0 Median : 2894.3
## Mean : -116.6 Mean : 4761.6 Mean : 4459.2
## 3rd Qu.: 0.0 3rd Qu.: 5998.3 3rd Qu.: 5645.4
## Max. : 0.0 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.0 Min. :0.7063 Min. :0.00000
## 1st Qu.: 0.0 1st Qu.:1.0000 1st Qu.:0.00000
## Median : 0.0 Median :1.0000 Median :0.00000
## Mean : 243.7 Mean :0.9993 Mean :0.07553
## 3rd Qu.: 0.0 3rd Qu.:1.0000 3rd Qu.:0.00000
## Max. :13605.4 Max. :1.0000 Max. :7.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. :0.0000 Min. : 0.00 Min. : 1.0
## 1st Qu.:0.0000 1st Qu.: 0.00 1st Qu.: 26.0
## Median :0.0000 Median : 0.00 Median : 57.0
## Mean :0.0283 Mean : 16.61 Mean :100.7
## 3rd Qu.:0.0000 3rd Qu.: 0.00 3rd Qu.:132.0
## Max. :7.0000 Max. :8200.00 Max. :881.0
##
## Rating PercentYield Completed Completed.num
## HR :1100 Min. :-1.0000 Completed : 0 Min. :0
## D : 966 1st Qu.:-0.8547 Not Completed:5018 1st Qu.:0
## C : 863 Median :-0.6678 Median :0
## E : 858 Mean :-0.5176 Mean :0
## B : 581 3rd Qu.:-0.2665 3rd Qu.:0
## (Other): 647 Max. : 1.7298 Max. :0
## NA's : 3
There’s no clear difference between these categories on most measures, aside from charged off loans being at least 121 days delinquent (loans defaulted on may be any number of days delinquent). Loans that were charged off tend to be created and closed slightly later. For all purposes, however, it may make sense to treat these two categories as identical, at least in the course of exploratory analysis.
Before using them as predictors of actual yield, I want to briefly look at the relationship between measures of predicted loss or yield. In this case, EstimatedEffectiveYield vs. EstimatedLoss seem to be the most relevant measures - how much one stands to gain, in total, vs. how much one stands to lose.
Here, and in the following cases where I plot the relationship between two quantitative variables, I model the trend using both a standard linear regression (orange line), which makes the sometimes obviously inaccurate assumption that the relationship is linear, and a GAM, or generalized additive model (blue line), which uses smooth functions to capture non-linearities (https://en.wikipedia.org/wiki/Generalized_additive_model). I also compute a Pearson correlation for each relationship (https://en.wikipedia.org/wiki/Pearson_correlation_coefficient). Pearson’s r unfortunately also measures only linear association, but it should still be informative regarding the main issue that I am concerned with: whether higher values of one variable correspond to higher values of the other variable, regardless of the exact shape of the relationship.
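As a rough illustration of the two kinds of trend line and the Pearson coefficient described here, the sketch below fits a straight line and a GAM to simulated data that rises and then flattens. The data, the variable names, and the use of `mgcv::gam` directly (rather than through a plotting layer) are assumptions made for the example, not the code behind the plots in this report.

```r
library(mgcv)  # recommended package bundled with standard R installations

set.seed(1)

# Hypothetical data: y rises with x but flattens out at higher x
x <- runif(500, min = 0, max = 1)
y <- sqrt(x) + rnorm(500, sd = 0.1)
sim <- data.frame(x = x, y = y)

# Straight-line trend (the kind of fit drawn as the orange line)
lin_fit <- lm(y ~ x, data = sim)

# Smooth trend that is allowed to bend (the kind of fit drawn as the blue line)
gam_fit <- gam(y ~ s(x), data = sim)

# Pearson's r summarises the linear part of the association only
r <- cor(sim$x, sim$y, method = "pearson")

# Share of variance captured by each fit
lin_r2 <- summary(lin_fit)$r.squared
gam_r2 <- summary(gam_fit)$r.sq  # adjusted R-squared as reported by mgcv
```

When the smooth fit explains noticeably more variance than the straight line, the linear correlation is best read as a statement about direction rather than shape.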
EstimatedEffectiveYield vs. EstimatedLoss
# for simplicity, although many of the relationships here do not appear simply linear, I assume linear correlations - the main question for this preliminary exploration is usually only whether one variable generally increases or decreases as the other increases
cor.test(data$EstimatedEffectiveYield, data$EstimatedLoss, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: data$EstimatedEffectiveYield and data$EstimatedLoss
## t = 385.89, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7956791 0.8005638
## sample estimates:
## cor
## 0.7981346
This plot shows that, as one would expect, when the estimated effective yield (taking into account fees or lost interest on charge-offs) is around or below zero, the estimated loss rises. It also interestingly shows that as the estimated effective yield rises, so does the estimated loss (\(r=0.80\), \(p<.001\)). Presumably based on historical data, then, Prosper predicts that the more one stands to gain, the more one stands to lose. Likely this is due to higher projected profits for higher-interest loans, which tend to be given to customers who cannot qualify for better terms (due, for example, to poor credit or financial insecurity). Lowest estimated risk of loss seems to be around 5% effective yield.
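The figures quoted in the text (\(r\), \(p\)) are read off the `cor.test` output. A minimal sketch of pulling them out programmatically follows; `report_cor` is a hypothetical helper and the data are simulated, so neither comes from the original analysis.

```r
# Hypothetical helper (not from the original analysis): formats a cor.test()
# result in the "r = ..., p ..." style used in the text
report_cor <- function(x, y) {
  ct <- cor.test(x, y, method = "pearson")
  r  <- unname(ct$estimate)   # sample correlation
  ci <- ct$conf.int           # 95 percent confidence interval by default
  p_txt <- if (ct$p.value < 0.001) "p < .001" else sprintf("p = %.3f", ct$p.value)
  sprintf("r = %.2f, 95%% CI [%.2f, %.2f], %s", r, ci[1], ci[2], p_txt)
}

set.seed(42)
# Simulated stand-ins for two positively related measures
est_yield <- rnorm(200)
est_loss  <- 0.8 * est_yield + rnorm(200, sd = 0.6)

res <- report_cor(est_yield, est_loss)
print(res)
```

Reporting the confidence interval alongside \(r\) makes it easier to see that, with samples this large, even very small correlations can be statistically significant.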
EstimatedLoss vs. LenderYield
cor.test(data$EstimatedLoss, data$LenderYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: data$EstimatedLoss and data$LenderYield
## t = 844.2, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9445879 0.9460197
## sample estimates:
## cor
## 0.9453084
This plot shows clearly that while predicted lender yield increases with predicted loss (\(r=0.95\), \(p<.001\)), at higher levels of predicted loss, the yield ceases to increase, and levels off, or even drops slightly towards higher levels of predicted loss. This is again presumably due to the fact that those loans which lenders profit the most from tend to be higher-interest loans - with those customers who are charged higher interest rates presumably also typically being those most likely to fail to repay their loan.
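For readers wondering how the t statistic in the output relates to the correlation, Pearson's test uses \(t = r\sqrt{df/(1-r^2)}\) with \(df = n - 2\). The short check below plugs in the values printed above; it is only a sanity check on the reported numbers, not part of the original analysis.

```r
# Values copied from the cor.test() output above
r  <- 0.9453084
df <- 84851

# t statistic implied by a Pearson correlation with df = n - 2
t_stat <- r * sqrt(df / (1 - r^2))
round(t_stat, 1)  # about 844.2, in line with the reported t
```

With more than 80,000 observations the t statistic becomes very large even for modest correlations, which is why the size of \(r\) is more informative here than the p-value.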
What one would expect, then, in terms of actual profit, is that as borrower demographics increasingly reflect traits that typically correspond to lower likelihood of repayment, and interest rates rise, lenders are both more likely to lose, and to stand a chance of earning more. The question then is if lenders profit from taking this risk, on average.
First, I want to quickly look at whether Rating and ProsperScore (post-2009) are correlated:
cor.test(lender_data$ProsperRating.num, lender_data$ProsperScore, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$ProsperRating.num and lender_data$ProsperScore
## t = 194.55, df = 26003, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7649226 0.7748224
## sample estimates:
## cor
## 0.7699188
Overall, these variables are clearly correlated (\(r=0.77\), \(p<.001\)), with higher (more favorable) risk scores disproportionately assigned to loans with higher ratings.
Rating
Here I will look at how Rating correlates with LoanStatus and PercentYield, or actual lender yield on past loans (pending any possible repayments on loans defaulted on).
LoanStatus
table_stats(lender_data, "Rating", "Completed.num")
PercentYield
table_stats(lender_data, "Rating", "PercentYield")
Overall, loans with higher Prosper ratings are much more likely to be repaid (confidence intervals show that this trend is reliable), and, on average (without taking into account other predictors of profit), they are the only loans that lenders actually profit on.
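The `table_stats` helper used here is defined elsewhere in this report. For readers who want to see what a grouped mean with a confidence interval involves, here is a hypothetical stand-in, `group_mean_ci`, applied to simulated data. It uses a t-based interval, which is an assumption and may differ from what `table_stats` actually computes.

```r
# Hypothetical stand-in for a grouped summary; the document's own
# table_stats() is defined elsewhere and may differ
group_mean_ci <- function(df, group, value, level = 0.95) {
  parts <- split(df[[value]], df[[group]], drop = TRUE)
  rows <- lapply(names(parts), function(g) {
    v  <- parts[[g]][!is.na(parts[[g]])]
    n  <- length(v)
    m  <- mean(v)
    se <- sd(v) / sqrt(n)                            # standard error of the mean
    half <- qt(1 - (1 - level) / 2, df = n - 1) * se  # half-width of the interval
    data.frame(group = g, n = n, mean = m, lower = m - half, upper = m + half)
  })
  do.call(rbind, rows)
}

set.seed(7)
# Simulated example: repayment indicator (1 = completed) by rating
toy <- data.frame(
  Rating = rep(c("A", "D", "HR"), each = 100),
  Completed.num = c(rbinom(100, 1, 0.85), rbinom(100, 1, 0.60), rbinom(100, 1, 0.40))
)
ci_table <- group_mean_ci(toy, "Rating", "Completed.num")
print(ci_table)
```

Because `Completed.num` is a 0/1 indicator, the group mean is simply the repayment rate for that rating, and non-overlapping intervals between groups suggest the difference is reliable.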
Overall, as there does not look to be a great difference between loans that were charged off or defaulted on, as discussed above, I will from now on treat the two categories as identical, and simply look at whether loans were completed, or not.
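Collapsing the two categories amounts to a simple recode. The sketch below shows one way to do it on toy data, assuming the data frame holds only finished loans (completed, charged off, or defaulted). The column names `Completed` and `Completed.num` match those in the summary output, but how they were actually derived is not shown in this section.

```r
# Toy data; assumes only finished loans are present
toy <- data.frame(
  LoanStatus = c("Completed", "Chargedoff", "Defaulted", "Completed", "Chargedoff"),
  stringsAsFactors = FALSE
)

# Collapse charged-off and defaulted loans into one "Not Completed" category
toy$Completed <- factor(
  ifelse(toy$LoanStatus == "Completed", "Completed", "Not Completed"),
  levels = c("Completed", "Not Completed")
)

# Numeric indicator, convenient for computing repayment rates
toy$Completed.num <- as.integer(toy$Completed == "Completed")

mean(toy$Completed.num)  # share of completed loans in the toy data
```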
ProsperScore
Here I will look at how ProsperScore correlates with lender profit. Higher Prosper scores should lead to higher chances of repayment, as they correspond to lower risk.
Completed
table_stats(lender_data, "Completed", "ProsperScore")
As can be seen, loans that were completed had higher (better) risk scores, although those that were not are not markedly lower.
PercentYield
table_stats(filter(lender_data, !is.na(ProsperScore)), "ProsperScore", "PercentYield")
Overall, it’s clear that lenders on average profit more from loans with higher Prosper scores, and that there is less variability in how much lenders yield (whether high or low) when the Prosper score is high. It can also be seen, however, that lenders profit relatively little from borrowers with the highest score (11), and that borrowers with slightly lower scores are on average more profitable, if slightly more risky.
Here I will look at how EstimatedEffectiveYield correlates with lender profit. Higher estimated yield should correspond to higher profit up to a point, although markedly high estimated yields are likely to be offset by higher likelihood of loans not being repaid.
EstimatedEffectiveYield
Completed
table_stats(lender_data, "Completed", "EstimatedEffectiveYield")
PercentYield
cor.test(lender_data$EstimatedEffectiveYield, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$EstimatedEffectiveYield and lender_data$PercentYield
## t = -30.966, df = 26003, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2002832 -0.1768394
## sample estimates:
## cor
## -0.1885881
Overall, loans not completed had somewhat higher estimated yields. Similarly, loans with higher estimated yield are more likely to show both higher actual yield and significant losses. On average, loans with higher estimated yields are less lucrative (\(r=-0.19\), \(p<.001\)). For loans under 5% effective yield, there seems to be relatively sparse data. Average profits appear to start dropping off noticeably around 1.5% estimated yield.
EstimatedLoss
Here I will look at how EstimatedLoss correlates with lender profit. Higher estimated loss should correspond to lower profit, although this may be partially offset by higher interest rates on higher-risk loans.
Completed
table_stats(lender_data, "Completed", "EstimatedLoss")
PercentYield
cor.test(lender_data$EstimatedLoss, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$EstimatedLoss and lender_data$PercentYield
## t = -4.532, df = 26003, p-value = 5.868e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.04023411 -0.01594506
## sample estimates:
## cor
## -0.02809373
Overall, loans which were not completed had higher estimated loss, as expected. For PercentYield, the average trend is not as clear, but it seems that higher estimated losses are in fact offset partially by higher yields, although the trend is towards greater losses as estimated losses increase (\(r=-0.03\), \(p<.001\)). At higher levels of estimated loss, data is more sparse and it’s more difficult to make firm conclusions.
EstimatedReturn
Here I will look at how EstimatedReturn correlates with lender profit. Higher estimated return should correspond to more profit, as it takes into account potential losses.
Completed
table_stats(lender_data, "Completed", "EstimatedReturn")
PercentYield
cor.test(lender_data$EstimatedReturn, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$EstimatedReturn and lender_data$PercentYield
## t = -7.4164, df = 26003, p-value = 1.24e-13
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.05806522 -0.03380829
## sample estimates:
## cor
## -0.04594353
Overall, and somewhat surprisingly, loans that are not completed have marginally higher estimated returns, suggesting that the likelihood of loss is not sufficiently taken into account. Similarly, yield is more likely to be relatively low at some (positive) values of estimated returns (around 10-15%), again suggesting that predicted measures don’t adequately predict losses. At the relatively high and low ends of estimated returns, it’s difficult to draw conclusions, due to relatively sparse data. On average, however, the trend is towards greater losses as predicted returns increase (\(r=-0.05\), \(p<.001\)).
In sum, it’s not entirely clear how well estimated profit in fact correlates with actual profit, but the plots above suggest that Prosper estimates may not be sufficiently conservative, in terms of predicting lender yield vs. loss.
First, I quickly want to see how well the numerical demographic variables correlate with each other, as well as whether they correlate with Prosper scores.
It’s clear that many of these variables correlate highly with each other, and others don’t, suggesting that they measure different underlying things. The Prosper scores appear to correlate relatively highly with credit scores, but show low to no correlation with other variables. At the least, it appears that demographic variables other than credit scores, and the financial states that credit scores typically reflect, do not contribute much to Prosper ratings.
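A correlation overview of this kind can be computed with a pairwise correlation matrix. The sketch below uses simulated columns with borrowed names purely for illustration; the set of variables and the plotting method actually used for the figure are not shown in this section, and `pairwise.complete.obs` is an assumed choice for handling missing values.

```r
set.seed(3)
# Simulated stand-ins for a few numeric borrower variables
n <- 300
credit_score <- rnorm(n, mean = 700, sd = 50)
toy <- data.frame(
  CreditScore       = credit_score,
  ProsperScore      = 0.02 * credit_score + rnorm(n, sd = 1),
  DebtToIncomeRatio = runif(n, min = 0, max = 1)
)
toy$DebtToIncomeRatio[sample(n, 20)] <- NA  # some missing values, as in real data

# Pairwise Pearson correlations, using all available pairs for each cell
cor_mat <- cor(toy, use = "pairwise.complete.obs", method = "pearson")
round(cor_mat, 2)
```

Using pairwise-complete observations keeps as much data as possible per cell, at the cost of each cell being based on a slightly different subset of rows.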
Occupation
Here I will look at which occupations correspond to higher lender profit. One would expect for higher-earning occupations to correspond to higher chances of repayment, and possibly higher profits.
Completed
PercentYield
There are too many occupations to make easy generalizations - they would likely need to be grouped into a smaller number of categories. However, one can observe general trends - those with higher-paying occupations (or occupational prospects) seem to be more reliable and more profitable borrowers. On the other hand, students in general, as well as several other relatively low-earning professions, are among the bank’s least profitable customers. Notably, realtors are neither profitable nor reliable. If higher earnings are the underlying reason for these trends, one would expect that income categories would show a similar trend.
IncomeRange
Here I will look at which income ranges correspond to higher lender profit. Based on the occupation data, one would expect for those with higher earnings to be more profitable and more likely to repay loans.
Completed
table_stats(lender_data, "IncomeRange", "Completed.num")
PercentYield
table_stats(lender_data, "IncomeRange", "PercentYield")
Here, it can be seen that as income rises, average yield increases, and both exceptionally high and low yields become less likely. The highest-earning categories are fairly reliably profitable on average, and repay their loans at a rate of around 75-80%.
EmploymentStatus
I expect that those who are employed - particularly employed full time - are more likely to repay loans, and more likely to be profitable, on average.
Completed
# reorders employment status labels
lender_data$EmploymentStatus <- ordered(lender_data$EmploymentStatus, c("Part-time","Employed","Full-time","Retired","Not employed","Self-employed","Not available","Other"))
table_stats(filter(lender_data, !is.na(EmploymentStatus)), "EmploymentStatus", "Completed.num")
PercentYield
table_stats(lender_data, "EmploymentStatus", "PercentYield")
Here it looks like, unsurprisingly, employed borrowers are more likely to repay loans, and further that only part- and full-time workers are consistently profitable for lenders. Those in the general category of ‘employed,’ however, are not.
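The reordering of employment status labels relies on `ordered()` with an explicit level list. The toy example below shows the effect on a handful of made-up values. One point worth knowing is that any label missing from the level list would silently become `NA`.

```r
status <- c("Full-time", "Part-time", "Retired", "Full-time", "Other")

# Without explicit levels, factor levels default to sorted (alphabetical) order
default_levels <- levels(factor(status))

# ordered() with explicit levels fixes the display order; values not listed
# in `levels` would become NA, so the list must cover every label present
status_ord <- ordered(status, levels = c("Part-time", "Full-time", "Retired", "Other"))

levels(status_ord)
table(status_ord)
```

Tables and plots built from the ordered factor then follow the chosen order instead of the alphabetical default.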
EmploymentStatusDuration
One would expect that longer employment status duration would correspond to more stable employment and higher likelihood of repayment and lender profit.
Completed
table_stats(lender_data, "Completed", "EmploymentStatusDuration")
PercentYield
cor.test(lender_data$EmploymentStatusDuration, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$EmploymentStatusDuration and lender_data$PercentYield
## t = 0.88961, df = 47471, p-value = 0.3737
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.004912687 0.013078073
## sample estimates:
## cor
## 0.004083024
In fact, employment status duration does not seem to predict loan payment or lender profit (\(r=0.004\), \(p=0.37\)).
IsBorrowerHomeowner
One might expect homeowners, who on average may be more financially stable, to be more likely to repay loans.
Completed
table_stats(lender_data, "Completed", "IsBorrowerHomeowner")
PercentYield
table_stats(lender_data, "IsBorrowerHomeowner", "PercentYield")
Those who paid off their loans are marginally more likely to be homeowners. Those who are homeowners are not likely to be much more profitable (the confidence intervals overlap).
CreditScore
In this case, I will average out upper-range and lower-range credit scores (particularly as the two are nearly perfectly correlated). One would expect for those with higher credit scores to be more likely to repay loans, and to be profitable for lenders.
Completed
table_stats(lender_data, "Completed", "CreditScore")
PercentYield
cor.test(lender_data$CreditScore, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$CreditScore and lender_data$PercentYield
## t = 22.563, df = 54492, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.08788103 0.10451774
## sample estimates:
## cor
## 0.0962061
cor.test(plot_data$CreditScore, plot_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: plot_data$CreditScore and plot_data$PercentYield
## t = 20.509, df = 54358, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.07927730 0.09596102
## sample estimates:
## cor
## 0.08762531
Those who pay off loans tend to have slightly higher credit scores. Yield is fairly low at lower credit scores, increasing more reliably above 0 at higher ranges (\(r=0.096\), \(p<0.001\)). Mostly, it is visually clear that low credit scores correspond to lower lender profit.
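The averaged `CreditScore` variable used in this section is the midpoint of the reported credit score range. A minimal sketch on toy rows follows. The two source column names are assumed from the Prosper data dictionary and do not appear in this section, so treat them as an assumption.

```r
# Toy rows; the two source column names are an assumption
toy <- data.frame(
  CreditScoreRangeLower = c(640, 680, 720, NA),
  CreditScoreRangeUpper = c(659, 699, 739, NA)
)

# Midpoint of the reported range; NA propagates when the range is missing
toy$CreditScore <- (toy$CreditScoreRangeLower + toy$CreditScoreRangeUpper) / 2
toy$CreditScore
```

In the toy rows the upper bound is always the lower bound plus a constant, so the two columns are perfectly correlated and the midpoint carries the same information as either bound.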
FirstRecordedCreditLine
It’s not clear here what the prediction should be - those with a longer credit history may be more likely to repay loans, but they may also simply be older.
# creates a new credit history age variable by subtracting the date of first credit line creation from the listing creation date (in days), and dividing by 365 to get the age of the credit history in years
lender_data$CreditHistoryAge <- as.numeric(difftime(lender_data$ListingCreationDate, lender_data$FirstRecordedCreditLine, units = "days")) / 365
Completed
table_stats(lender_data, "Completed", "CreditHistoryAge")
PercentYield
cor.test(as.numeric(lender_data$CreditHistoryAge), lender_data$PercentYield, method="pearson")
##
## Pearson's product-moment correlation
##
## data: as.numeric(lender_data$CreditHistoryAge) and lender_data$PercentYield
## t = 3.4478, df = 54386, p-value = 0.0005656
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.006379079 0.023183881
## sample estimates:
## cor
## 0.01478252
It appears that there is not much difference in terms of credit history age (roughly estimated, not accounting for loan date) between completed and non-completed loans. Likewise, there appears to be little effect on lender profit, although it looks like there might be a very slight trend for those with very young credit history to be less profitable (\(r=0.01\), \(p<0.001\)).
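The credit history age used in this section is a difference between two dates, converted to years. Subtracting date-times in R returns a `difftime` whose units are chosen automatically, so requesting days explicitly avoids surprises. The dates below are made up for illustration, apart from the listing timestamp, which is borrowed from the first row of the data preview.

```r
listing_created   <- as.POSIXct("2007-08-26 19:09:29", tz = "UTC")
first_credit_line <- as.POSIXct("2001-10-11 00:00:00", tz = "UTC")  # hypothetical date

# Ask for days explicitly rather than relying on automatic unit selection
age_days  <- as.numeric(difftime(listing_created, first_credit_line, units = "days"))
age_years <- age_days / 365  # same rough conversion as in the text (ignores leap years)
age_years
```

Dividing by 365 slightly overstates the age over long spans. Dividing by 365.25 would be a closer approximation, though the difference is negligible for an exploratory analysis.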
OpenRevolvingAccounts
It’s usually considered to be good to have some, but not too many open revolving accounts, so it may be the case that those with either too few, or too many revolving accounts may be less profitable.
Completed
table_stats(lender_data, "Completed", "OpenRevolvingAccounts")
PercentYield
cor.test(lender_data$OpenRevolvingAccounts, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$OpenRevolvingAccounts and lender_data$PercentYield
## t = 10.126, df = 55082, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.03476782 0.05143873
## sample estimates:
## cor
## 0.04310627
Those who paid off loans have slightly more open revolving accounts, on average. It also seems that those with no to very few revolving accounts may be less profitable, and at higher ranges it is difficult to tell what the trend is (but it does not appear to be a very strong one). Overall, there is a slight trend for those with more open revolving accounts to be more profitable, on average (\(r=0.04\), \(p<0.001\)).
InquiriesLast6Months
Those with more recent credit inquiries may be more desperate for credit, and therefore less financially stable, and less likely to repay loans.
Completed
table_stats(lender_data, "Completed", "InquiriesLast6Months")
PercentYield
cor.test(lender_data$InquiriesLast6Months, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$InquiriesLast6Months and lender_data$PercentYield
## t = -35.464, df = 54386, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1585468 -0.1421182
## sample estimates:
## cor
## -0.1503429
This appears to be a relatively robust predictor - those who did not pay off loans have more recent inquiries. Similarly, the more recent credit inquiries, the lower the profits (this trend remains if sparse data at the higher ranges is filtered out) (\(r=-0.15\), \(p<0.001\)).
AmountDelinquent
Those who have a higher amount delinquent at the time the credit information was pulled should be less likely to repay loans, and be less profitable, on average.
Completed
table_stats(lender_data, "Completed", "AmountDelinquent")
PercentYield
As higher-range data is sparse, I filtered out amounts over $30,000.
cor.test(lender_data$AmountDelinquent, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$AmountDelinquent and lender_data$PercentYield
## t = -5.5248, df = 47464, p-value = 3.316e-08
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.03433925 -0.01635843
## sample estimates:
## cor
## -0.02535089
It is in fact the case that those who did not pay off loans had a higher amount delinquent at the time the credit information was pulled. Similarly, it seems that lender profit decreases when borrowers have even a relatively small amount of money delinquent, but then stays relatively stable, or decreases gradually. The overall trend is for profit to decrease as amount delinquent increases (\(r=-0.025\), \(p<0.001\)).
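The trimming of sparse high values mentioned in this section (amounts over $30,000) can also be applied before computing a correlation, which makes it possible to check how much a few extreme values drive the coefficient. The sketch below does this on simulated, skewed data. The numbers are invented, and it is an assumption that one would want the correlation on the trimmed range at all.

```r
set.seed(11)
# Simulated, right-skewed stand-in for AmountDelinquent with a weak
# negative relationship to yield
amount <- round(rexp(2000, rate = 1 / 8000))
yield  <- -0.00001 * amount + rnorm(2000, sd = 0.3)
toy <- data.frame(AmountDelinquent = amount, PercentYield = yield)

# Keep only the range where data is reasonably dense (threshold from the text)
trimmed <- subset(toy, AmountDelinquent <= 30000)

r_all     <- cor(toy$AmountDelinquent, toy$PercentYield)
r_trimmed <- cor(trimmed$AmountDelinquent, trimmed$PercentYield)
c(all = r_all, trimmed = r_trimmed, dropped = nrow(toy) - nrow(trimmed))
```

If the two coefficients differ noticeably, the sparse upper range is influencing the overall estimate and is worth reporting separately.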
DelinquenciesLast7Years
Those who have more delinquencies should be less likely to pay off loans.
Completed
table_stats(lender_data, "Completed", "DelinquenciesLast7Years")
PercentYield
cor.test(lender_data$DelinquenciesLast7Years, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$DelinquenciesLast7Years and lender_data$PercentYield
## t = -4.499, df = 54095, p-value = 6.84e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.02776243 -0.01091512
## sample estimates:
## cor
## -0.01934015
Those who did not pay off loans in fact have more delinquencies, and there is a slight trend for them to be less profitable (\(r=-0.02\), \(p<0.001\)).
RevolvingCreditBalance
Those who have a higher revolving credit balance may be less likely to repay their loans, although a higher balance does not necessarily reflect financial difficulties (and sometimes in fact reflects greater financial means).
Completed
table_stats(lender_data, "Completed", "RevolvingCreditBalance")
PercentYield
In this case I again filtered out data above $500,000, due to sparsity.
cor.test(lender_data$RevolvingCreditBalance, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$RevolvingCreditBalance and lender_data$PercentYield
## t = -1.3865, df = 47482, p-value = 0.1656
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.015356593 0.002631655
## sample estimates:
## cor
## -0.006362984
It looks like there’s little to no difference in revolving credit balance between those who do, and don’t pay off loans. It also looks like revolving credit balance is not a good predictor of lender yield (\(r=-0.006\), \(p=0.17\)).
BankcardUtilization
Typically, a very high percentage of bankcard utilization would likely correspond to lower financial means, but a very low percentage may also reflect a lack of history of paying off loans.
Completed
table_stats(lender_data, "Completed", "BankcardUtilization")
PercentYield
cor.test(lender_data$BankcardUtilization, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$BankcardUtilization and lender_data$PercentYield
## t = 8.2455, df = 47482, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.02882841 0.04679167
## sample estimates:
## cor
## 0.0378131
There is relatively little difference between those who have and haven’t paid off loans in terms of bankcard utilization, although those who have not completed loans have relatively high bankcard utilization. In terms of yield, there is a slight trend for those with low bankcard utilization to be less profitable than those with some bankcard utilization (\(r=0.04\), \(p<0.001\)), particularly starting around 1%. Those with higher bankcard utilization, however, appear to be less profitable (at very high levels, data is sparse).
DebtToIncomeRatio
Those with a higher debt to income ratio should be less likely to pay off their loans.
Completed
table_stats(lender_data, "Completed", "DebtToIncomeRatio")
PercentYield
cor.test(lender_data$DebtToIncomeRatio, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$DebtToIncomeRatio and lender_data$PercentYield
## t = -8.3542, df = 50852, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.04569820 -0.02833931
## sample estimates:
## cor
## -0.03702155
Those who have not paid off their loans have a higher debt to income ratio, and there appears to be a clear trend for those with a higher ratio to be less profitable (\(r=-0.04\), \(p<0.001\)).
IncomeVerifiable
Those with verifiable income may be more likely to pay off loans.
Completed
table_stats(lender_data, "Completed", "IncomeVerifiable")
PercentYield
table_stats(lender_data, "IncomeVerifiable", "PercentYield")
There is very little difference between those who have and haven’t paid off loans, in terms of whether their income is verifiable. However, those whose income is not verifiable tend to be less profitable.
TotalTrades
The prediction here is not clear to me - whether more trade lines opened in the past have any effect, one way or the other, on likelihood of loan repayment. It may be possible that those with more trade lines have more financial means, at the least.
Completed
table_stats(lender_data, "Completed", "TotalTrades")
PercentYield
cor.test(lender_data$TotalTrades, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$TotalTrades and lender_data$PercentYield
## t = 4.6424, df = 47542, p-value = 3.453e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.01230008 0.03026956
## sample estimates:
## cor
## 0.02128654
In fact, there is no clear trend here (those who did not pay off loans have marginally fewer trades), although those with more past trade lines may be very marginally more profitable (\(r=0.02\), \(p<0.001\)).
TradesNeverDelinquent
As usual, those with more delinquencies should be less likely to pay off loans.
Completed
table_stats(lender_data, "Completed", "TradesNeverDelinquent.per")
PercentYield
cor.test(lender_data$TradesNeverDelinquent.per, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$TradesNeverDelinquent.per and lender_data$PercentYield
## t = 1.9584, df = 47542, p-value = 0.05018
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -7.208015e-06 1.796896e-02
## sample estimates:
## cor
## 0.008981603
Those who did not pay off loans have slightly fewer trade lines that were not delinquent. There is not a clear enough trend with respect to lender yield to draw conclusions, at least on the basis of a linear correlation (\(r=0.009\), \(p=0.05\)).
First, I quickly want to see how well the numerical Prosper variables correlate, as well as whether they correlate with Prosper scores.
There is a moderately strong correlation between having a higher number of Prosper loans and having more on-time Prosper payments, unsurprisingly. There are also moderate to strong correlations among the variables related to friend recommendations and contributions, as would be expected.
TotalProsperLoans
Those with more Prosper loans may be more reliable customers, but this is assuming that they have at least partially paid off those loans prior to opening the relevant loan. If there are many loans opened around the same time, this may suggest more financial difficulties.
Completed
NA values are replaced with 0’s.
table_stats(replace_na(lender_data, list(TotalProsperLoans = 0)), "Completed", "TotalProsperLoans")
PercentYield
table_stats(replace_na(lender_data, list(TotalProsperLoans = 0)), "TotalProsperLoans", "PercentYield")
Those who paid off their loans have more Prosper loans, on average. Those who have never had any Prosper loans are significantly less profitable - for those who have taken out loans, there is not quite enough data to draw conclusions regarding whether taking out more, or fewer loans, corresponds to borrowers being more profitable.
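Replacing missing values with zero, as done for TotalProsperLoans in this section, encodes the assumption that a missing value means no prior Prosper loans. A base R equivalent of that replacement on a toy vector is sketched below. The vector is made up, and the analysis itself uses `replace_na` on the full data frame.

```r
prior_loans <- c(NA, 1, 2, NA, 1)  # NA taken to mean no prior Prosper loan on record

# Treat missing as "zero prior loans"; this is an interpretation of the NAs,
# reasonable only if NA really means there were no earlier loans
prior_loans_filled <- ifelse(is.na(prior_loans), 0, prior_loans)

table(prior_loans_filled)
```

If missing values could instead mean that the information was simply not recorded, keeping them as `NA` and analysing borrowers with prior loans separately would be the more cautious choice.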
OnTimeProsperPayments
More on-time Prosper payments, for those who have had prior loans, should correspond to a higher likelihood of repaying loans.
Completed
table_stats(lender_data, "Completed", "OnTimeProsperPayments")
PercentYield
cor.test(lender_data$OnTimeProsperPayments, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$OnTimeProsperPayments and lender_data$PercentYield
## t = 11.16, df = 10537, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.08917715 0.12691575
## sample estimates:
## cor
## 0.1080854
Of those who have taken out Prosper loans previously, those who repaid their loans did have slightly more on-time Prosper payments, on average. Those who have more on-time payments also appear to be significantly more profitable (\(r=0.11\), \(p<0.001\)).
ProsperPrincipalOutstanding
The more Prosper principal outstanding at the time the listing was created, the lower the likelihood should be of repaying a loan.
Completed
table_stats(lender_data, "Completed", "ProsperPrincipalOutstanding")
PercentYield
cor.test(lender_data$ProsperPrincipalOutstanding, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$ProsperPrincipalOutstanding and lender_data$PercentYield
## t = -16.345, df = 10537, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1758119 -0.1385712
## sample estimates:
## cor
## -0.1572474
Those who did not pay off their loans have somewhat more outstanding principal at the time of loan creation, and the more principal outstanding, the less profitable they appear to be to lenders (\(r=-0.16\), \(p<0.001\)).
Recommendations
The more recommendations, the more reliable the borrower should be.
Completed
table_stats(lender_data, "Completed", "Recommendations")
PercentYield
cor.test(lender_data$Recommendations, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$Recommendations and lender_data$PercentYield
## t = 2.9083, df = 55082, p-value = 0.003636
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.004040256 0.020739638
## sample estimates:
## cor
## 0.01239081
There is no clear difference in number of recommendations between those who have and haven’t paid off loans, but more recommendations may correspond to marginally higher yield, although this seems to be a fairly weak trend, due in part to relatively few borrowers having any recommendations (\(r=0.01\), \(p<0.01\)).
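The confidence interval printed by `cor.test()` is based on Fisher's z-transform, and it can be recomputed from the correlation and sample size alone. The sketch below does this for the Recommendations result, using only the printed values.

```r
# Values copied from the cor.test() output above
r <- 0.01239081
n <- 55082 + 2   # df = n - 2

z  <- atanh(r)            # Fisher z-transform of the correlation
se <- 1 / sqrt(n - 3)     # approximate standard error on the z scale

# 95% interval on the z scale, transformed back to the correlation scale
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)
ci
```

The interval excludes 0 but sits entirely below 0.03, which is consistent with reading this as a statistically detectable but very weak trend.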
InvestmentFromFriendsCount
Those with more investment from friends may be regarded by friends as relatively trustworthy, and should be more likely to repay loans.
Completed
table_stats(lender_data, "Completed", "InvestmentFromFriendsCount")
PercentYield
cor.test(lender_data$InvestmentFromFriendsCount, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$InvestmentFromFriendsCount and lender_data$PercentYield
## t = 4.9193, df = 55082, p-value = 8.712e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.01260681 0.02930142
## sample estimates:
## cor
## 0.02095558
Those who paid off their loans had more friends invest. For lender yield, the trend is somewhat weak, also due to paucity of data, but those with at least some investment from friends appear to be a bit more profitable (\(r=0.02\), \(p<0.001\)).
InvestmentFromFriendsAmount
Completed
table_stats(lender_data, "Completed", "InvestmentFromFriendsAmount")
PercentYield
cor.test(lender_data$InvestmentFromFriendsAmount, lender_data$PercentYield, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: lender_data$InvestmentFromFriendsAmount and lender_data$PercentYield
## t = 1.4999, df = 55082, p-value = 0.1336
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.001960388 0.014740876
## sample estimates:
## cor
## 0.006390689
There is no clear difference between those who have and haven’t paid off loans, in terms of investment from friends. For yield, there is no clear trend (\(r=0.006\), \(p=0.13\)).
LoanOriginationQuarter
I have no predictions here, as I am not familiar with the history of the company.
Completed
table_stats(lender_data, "LoanOriginationQuarter", "Completed.num")
PercentYield
table_stats(lender_data, "LoanOriginationQuarter", "PercentYield")
Interestingly, the percentage of loans paid off has fluctuated noticeably by quarter, and average lender yield has fluctuated very noticeably: there have been extended periods where average lender yield was well above 0, and other extended periods where it was well below 0.
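Reading quarterly trends depends on the quarters appearing in chronological order. Labels such as "Q2 2008" sort alphabetically by quarter number rather than by year when converted to a factor, so the sketch below shows one way to re-level them. How `table_stats()` orders its groups is not shown here, so whether this step is needed is an assumption on my part; the variable names are illustrative.

```r
# Labels in the "Qn YYYY" format used by LoanOriginationQuarter
quarters <- c("Q2 2008", "Q4 2007", "Q3 2007", "Q1 2008")

# Split each label into quarter number and year
q_num  <- as.integer(substr(quarters, 2, 2))
q_year <- as.integer(substr(quarters, 4, 7))

# Re-level the factor chronologically (year first, then quarter)
chrono_levels <- unique(quarters[order(q_year, q_num)])
quarter_fct <- factor(quarters, levels = chrono_levels)

levels(quarter_fct)
```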
Here, I will look at BorrowerAPR (lower is better), BorrowerRate (lower is better), LoanOriginalAmount (very roughly assuming that getting larger loans is somewhat preferable, though this is likely confounded by the borrower’s needs and financial situation), MonthlyLoanPayment (lower is generally better, although this may also be confounded by the borrower’s financial means, as well as the amount of the loan), Term (roughly assuming that longer is better, but this may also be confounded), and PercentFunded (more is better).
First, I want to see how well these variables correlate, as well as whether they correlate with Prosper ratings and estimates.
# re-defines relevant data to include open loans, but keeps PercentYield calculation, just in case, as well as credit score and credit history age calculations
borrower_data <- data %>%
  mutate(PercentYield = (
    (LoanOriginalAmount - LP_NetPrincipalLoss + LP_ServiceFees + LP_CollectionFees + LP_NonPrincipalRecoverypayments + LP_InterestandFees)
    / LoanOriginalAmount)
    - 1) %>%
  mutate(CreditScore = (CreditScoreRangeLower + CreditScoreRangeUpper) / 2) %>%
  # explicit units guard against difftime choosing hours or minutes automatically
  mutate(CreditHistoryAge = as.numeric(difftime(ListingCreationDate, FirstRecordedCreditLine, units = "days")) / 365)
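To make the PercentYield expression above concrete, here is a worked example on two made-up loans. The helper name `percent_yield` and the loan figures are hypothetical; only the arithmetic mirrors the `mutate()` call.

```r
# Hypothetical helper mirroring the PercentYield expression above
percent_yield <- function(amount, net_principal_loss, service_fees,
                          collection_fees, recovery, interest_and_fees) {
  (amount - net_principal_loss + service_fees + collection_fees +
     recovery + interest_and_fees) / amount - 1
}

# Made-up loan repaid in full: $1,000 principal, $150 interest, $20 service fees
# (fees enter as negative numbers, as they do in the LP_ServiceFees column)
repaid <- percent_yield(1000, 0, -20, 0, 0, 150)

# Made-up loan charged off: $600 of principal lost, $50 interest collected
charged_off <- percent_yield(1000, 600, -5, 0, 0, 50)

c(repaid = repaid, charged_off = charged_off)
```

The first loan yields 13% on the original amount and the second loses 55.5%, which illustrates how a single charge-off can outweigh the interest earned on several repaid loans.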
In this case, there seem to be strong correlations among borrower terms, and between borrower terms and Prosper scores. What appears likely is that Prosper uses more-or-less the same data to determine scores and borrower terms, and the borrower terms they decide on for any given loan are unsurprisingly highly correlated.
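A correlation matrix is one compact way to check this kind of claim. The sketch below uses synthetic data (not the Prosper data) with column names borrowed from the dataset, and `use = "pairwise.complete.obs"` so that missing values in one column do not discard whole rows.

```r
set.seed(1)

# Synthetic stand-in for borrower terms (not the Prosper data)
rate <- runif(200, 0.05, 0.35)
toy_terms <- data.frame(
  BorrowerRate       = rate,
  BorrowerAPR        = rate + 0.02 + rnorm(200, sd = 0.005),
  LoanOriginalAmount = 20000 - 40000 * rate + rnorm(200, sd = 3000)
)
toy_terms$BorrowerAPR[1:5] <- NA  # mimic missing values

# Pairwise Pearson correlations, skipping missing values pair by pair
cor_mat <- cor(toy_terms, use = "pairwise.complete.obs")
round(cor_mat, 2)
```

On the real data, the same call could be applied to the borrower-term and Prosper-score columns of `borrower_data`.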
I expect that borrowers with higher Prosper ratings will have better terms, as defined above.
Rating
table_stats_mult(plot_data, "Measure", "Rating", "value")
ProsperScore
table_stats_mult(plot_data, "Measure", "ProsperScore", "value")
This appears to broadly be the case: borrowers with higher Prosper ratings, or higher Prosper scores, have lower APRs and interest rates. However, they have higher monthly payments, likely due to larger loans. All groups have their loans about equally funded, and loan terms are broadly similar among all groups, even if they tend to be slightly longer for those with better risk scores. Term and PercentFunded barely vary, so they are less informative and it would be difficult to draw conclusions about them. I will therefore not include them in this preliminary exploration, although they may later turn out to be informative. MonthlyLoanPayment is confounded by the amount of the original loan, so I will transform it to reflect the fraction of the loan paid per month.
borrower_data <- borrower_data %>%
  mutate(MonthlyLoanPercent = MonthlyLoanPayment / LoanOriginalAmount)
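To see why dividing by the loan amount removes the confound, consider a standard fixed-rate amortizing loan, where the monthly payment is the principal times a factor that depends only on the rate and the term. That Prosper payments follow this formula is my assumption; the data does not state it, and `monthly_share` is a hypothetical helper.

```r
# Monthly payment as a share of principal for a fixed-rate amortizing loan.
# Assumes standard amortization; the source does not state Prosper's formula.
monthly_share <- function(annual_rate, term_months) {
  r <- annual_rate / 12
  r / (1 - (1 + r)^(-term_months))
}

# Illustrative loan: 17.5% annual rate over 36 months
share_36 <- monthly_share(0.175, 36)

# The principal cancels, so the share is the same for any loan size
payment_5k  <- 5000  * share_36
payment_20k <- 20000 * share_36

c(share = share_36, payment_5k = payment_5k, payment_20k = payment_20k)
```

Under this assumption the share comes to roughly 3.6% of principal per month regardless of loan size, so MonthlyLoanPercent would mostly carry information about the rate and the term rather than the amount borrowed.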
EstimatedEffectiveYield
# function to loop through relevant measures, printing out Pearson correlations with the variable of interest
complex_cor <- function(var) {
  for (measure in c("BorrowerAPR","BorrowerRate","LoanOriginalAmount","MonthlyLoanPercent")) {
    print(paste(var,"vs.",measure))
    print(cor.test(borrower_data[[measure]], borrower_data[[var]], method = "pearson"))
  }
}
complex_cor("EstimatedEffectiveYield")
## [1] "EstimatedEffectiveYield vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 586.55, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8942956 0.8969580
## sample estimates:
## cor
## 0.8956348
##
## [1] "EstimatedEffectiveYield vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 585.39, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8939390 0.8966099
## sample estimates:
## cor
## 0.8952825
##
## [1] "EstimatedEffectiveYield vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -100.89, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3332603 -0.3212446
## sample estimates:
## cor
## -0.3272657
##
## [1] "EstimatedEffectiveYield vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 91.78, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2943816 0.3066233
## sample estimates:
## cor
## 0.3005148
Here it seems that APR and interest rates are optimal when estimated effective yield is around 5%, and high at lower yields (when yield is expected to be low) or at higher yields (where yield primarily comes from high interest rates on high-risk loans).
Loan amounts and monthly loan payments are more likely to be high when the yield is around 10%, with lower amounts for low-yield loans and for high-yield but high-risk loans. Overall, looking at the gam model, which can fit non-linearities, optimal loan terms appear to correspond to an estimated yield of 5-10%.
On average, however, it looks like higher estimated yield corresponds to higher APR (\(r=0.90\), \(p<.001\)) and interest rates (\(r=0.90\), \(p<.001\)), lower original loan amounts (\(r=-0.33\), \(p<.001\)), and higher monthly loan payments (\(r=0.30\), \(p<.001\)).
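The printed `cor.test()` output is long, and the same numbers could be collected into one table. The sketch below is a hypothetical variant of `complex_cor()` (the name `cor_table` and the demo data are mine) that takes the data frame as an argument and returns one row per measure.

```r
# Hypothetical variant of complex_cor() that returns a data frame
# instead of printing each test result
cor_table <- function(df, var, measures) {
  rows <- lapply(measures, function(measure) {
    res <- cor.test(df[[measure]], df[[var]], method = "pearson")
    data.frame(
      variable = var,
      measure  = measure,
      r        = unname(res$estimate),
      ci_low   = res$conf.int[1],
      ci_high  = res$conf.int[2],
      p_value  = res$p.value
    )
  })
  do.call(rbind, rows)
}

# Synthetic demo data (not the Prosper data)
set.seed(42)
demo <- data.frame(x = rnorm(100))
demo$a <- demo$x + rnorm(100)
demo$b <- -demo$x + rnorm(100)

result <- cor_table(demo, "x", c("a", "b"))
result
```

On the real data this would be called as, for example, `cor_table(borrower_data, "EstimatedEffectiveYield", c("BorrowerAPR", "BorrowerRate", "LoanOriginalAmount", "MonthlyLoanPercent"))`.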
EstimatedLoss
complex_cor("EstimatedLoss")
## [1] "EstimatedLoss vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 881.84, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9488713 0.9501952
## sample estimates:
## cor
## 0.9495375
##
## [1] "EstimatedLoss vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 844.11, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9445764 0.9460085
## sample estimates:
## cor
## 0.945297
##
## [1] "EstimatedLoss vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -138.69, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4353596 -0.4243895
## sample estimates:
## cor
## -0.4298904
##
## [1] "EstimatedLoss vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 131.57, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.4060365 0.4172132
## sample estimates:
## cor
## 0.4116403
Here, it seems that as estimated loss increases, borrower APR and interest rates go up sharply (then, at around 20%, they seem to level off, presumably due to some kind of cap). On the other hand, loan amounts and monthly payments are lower for loans with higher estimated loss.
On average, higher estimated loss corresponds to higher APR (\(r=0.95\), \(p<.001\)) and interest rates (\(r=0.95\), \(p<.001\)), lower loan amounts (\(r=-0.43\), \(p<.001\)), and higher monthly payments relative to loan size (\(r=0.41\), \(p<.001\)).
EstimatedReturn
complex_cor("EstimatedReturn")
## [1] "EstimatedReturn vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 380.81, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7917782 0.7967457
## sample estimates:
## cor
## 0.7942752
##
## [1] "EstimatedReturn vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 413.73, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8154276 0.8198876
## sample estimates:
## cor
## 0.8176699
##
## [1] "EstimatedReturn vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -86.98, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2922833 -0.2799279
## sample estimates:
## cor
## -0.2861175
##
## [1] "EstimatedReturn vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 47.148, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1532143 0.1663277
## sample estimates:
## cor
## 0.159778
APR and interest rates are optimal for loans with an estimated return of around 5%, likely for the same reasons as speculated above (regarding high-risk loans). Loan amounts and monthly loan payments also peak at around a 5% estimated return.
On average, higher estimated return corresponds to higher APRs (\(r=0.79\), \(p<.001\)) and interest rates (\(r=0.82\), \(p<.001\)), lower loan amounts (\(r=-0.29\), \(p<.001\)), and higher relative monthly payments (\(r=0.16\), \(p<.001\)).
Occupation
Again, I expect higher-paying occupations will receive better terms, on average.
Overall, it looks like higher-paying professions have higher approved loan amounts and higher monthly payments, and moderately lower APR and interest rates, on average. Students, on average, have much smaller loans and lower monthly payments, and slightly higher APR and interest rates.
IncomeRange
I am omitting the Not displayed category, as it’s unclear what to make of it.
table_stats_mult(plot_data, "Measure", "IncomeRange", "value")
Overall, those with higher income do receive more favorable terms (higher loan amounts, lower rates), together with slightly lower relative monthly loan payments. However, those with $0 income also receive relatively favorable terms, for reasons I can’t currently determine.
summary(filter(borrower_data, IncomeRange=="$0"))
## ListingKey ListingNumber ListingCreationDate
## 00033425227988088FA6752: 1 Min. :105697 Min. :2007-03-02
## 00693420495385812DA00DC: 1 1st Qu.:230408 1st Qu.:2007-11-12
## 00BC34179286853484DFB84: 1 Median :307845 Median :2008-04-11
## 01AB341657891640379960F: 1 Mean :300814 Mean :2008-05-14
## 02ED340355421846781EC6D: 1 3rd Qu.:340467 3rd Qu.:2008-05-28
## 02F6339443600285910FBF0: 1 Max. :887251 Max. :2013-09-03
## (Other) :615
## CreditGrade Term LoanStatus
## B :146 Min. :36.00 Completed :370
## AA :125 1st Qu.:36.00 Chargedoff :185
## A : 96 Median :36.00 Defaulted : 51
## C : 89 Mean :36.15 Current : 11
## D : 59 3rd Qu.:36.00 Past Due (1-15 days): 2
## (Other): 61 Max. :60.00 Past Due (>120 days): 1
## NA's : 45 (Other) : 1
## ClosedDate BorrowerAPR BorrowerRate LenderYield
## Min. :2007-08-31 Min. :0.01657 Min. :0.0050 Min. :-0.0050
## 1st Qu.:2009-03-10 1st Qu.:0.14960 1st Qu.:0.1400 1st Qu.: 0.1300
## Median :2010-02-16 Median :0.18726 Median :0.1750 Median : 0.1650
## Mean :2010-02-09 Mean :0.21035 Mean :0.1952 Mean : 0.1857
## 3rd Qu.:2011-01-02 3rd Qu.:0.26799 3rd Qu.:0.2500 3rd Qu.: 0.2400
## Max. :2014-02-13 Max. :0.39153 Max. :0.3500 Max. : 0.3400
## NA's :15
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.0166 Min. :0.0210 Min. :-0.0166
## 1st Qu.: 0.1399 1st Qu.:0.0990 1st Qu.: 0.1132
## Median : 0.2352 Median :0.1300 Median : 0.1243
## Mean : 0.2073 Mean :0.1281 Mean : 0.1157
## 3rd Qu.: 0.2848 3rd Qu.:0.1650 3rd Qu.: 0.1360
## Max. : 0.2957 Max. :0.2500 Max. : 0.1698
## NA's :576 NA's :576 NA's :576
## ProsperRating.num ProsperRating.alpha ProsperScore ListingCategory.num
## Min. :1.000 HR : 17 Min. :1.0 0 :183
## 1st Qu.:1.000 D : 15 1st Qu.:4.0 3 :157
## Median :2.000 E : 6 Median :5.0 4 :105
## Mean :2.422 B : 3 Mean :4.6 1 : 97
## 3rd Qu.:3.000 C : 2 3rd Qu.:6.0 7 : 31
## Max. :6.000 (Other): 2 Max. :9.0 5 : 28
## NA's :576 NA's :576 NA's :576 (Other): 20
## BorrowerState Occupation EmploymentStatus
## CA :103 Other :281 Full-time :290
## WA : 40 Sales - Retail : 36 Self-employed:257
## IL : 38 Professional : 35 Part-time : 46
## GA : 36 Sales - Commission: 34 Not employed : 11
## FL : 35 Construction : 26 Employed : 10
## (Other):351 Realtor : 20 Retired : 7
## NA's : 18 (Other) :189 (Other) : 0
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 0.00 Mode :logical Mode :logical
## 1st Qu.: 10.00 FALSE:347 FALSE:477
## Median : 37.00 TRUE :274 TRUE :144
## Mean : 65.55
## 3rd Qu.: 87.00
## Max. :440.00
##
## GroupKey DateCreditPulled CreditScoreRangeLower
## 9BBE337094173775621CD34: 12 Min. :2007-03-01 Min. :520.0
## 3D4D3366260257624AB272D: 9 1st Qu.:2007-11-05 1st Qu.:640.0
## B7BA33816687475120A8407: 9 Median :2008-04-07 Median :680.0
## 783C3371218786870A73D20: 7 Mean :2008-05-08 Mean :686.2
## E37B3365357900652F8BF9E: 6 3rd Qu.:2008-05-21 3rd Qu.:740.0
## (Other) :100 Max. :2013-09-02 Max. :860.0
## NA's :478
## CreditScoreRangeUpper FirstRecordedCreditLine CurrentCreditLines
## Min. :539.0 Min. :1961-04-28 Min. : 0.00
## 1st Qu.:659.0 1st Qu.:1989-09-26 1st Qu.: 4.00
## Median :699.0 Median :1994-10-22 Median : 8.00
## Mean :705.2 Mean :1994-07-12 Mean : 9.09
## 3rd Qu.:759.0 3rd Qu.:2000-04-26 3rd Qu.:12.00
## Max. :879.0 Max. :2008-08-26 Max. :52.00
## NA's :3
## OpenCreditLines TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : 0.00 Min. : 2.00 Min. : 0.000
## 1st Qu.: 4.00 1st Qu.: 11.00 1st Qu.: 3.000
## Median : 7.00 Median : 21.00 Median : 5.000
## Mean : 7.91 Mean : 22.97 Mean : 6.332
## 3rd Qu.:11.00 3rd Qu.: 31.00 3rd Qu.: 9.000
## Max. :48.00 Max. :101.00 Max. :32.000
## NA's :3
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. : 0.0 Min. : 0.000 Min. : 0.000
## 1st Qu.: 50.0 1st Qu.: 0.000 1st Qu.: 2.000
## Median : 190.0 Median : 1.000 Median : 5.000
## Mean : 483.4 Mean : 2.307 Mean : 7.836
## 3rd Qu.: 607.0 3rd Qu.: 3.000 3rd Qu.:10.000
## Max. :6616.0 Max. :44.000 Max. :78.000
## NA's :3
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. : 0.0000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 0.0000 1st Qu.: 0.0 1st Qu.: 0.000
## Median : 0.0000 Median : 0.0 Median : 0.000
## Mean : 0.5324 Mean : 472.9 Mean : 2.778
## 3rd Qu.: 0.0000 3rd Qu.: 0.0 3rd Qu.: 0.000
## Max. :20.0000 Max. :33134.0 Max. :99.000
## NA's :3 NA's :3 NA's :3
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. : 0.0000 Min. :0.00000 Min. : 0
## 1st Qu.: 0.0000 1st Qu.:0.00000 1st Qu.: 978
## Median : 0.0000 Median :0.00000 Median : 5618
## Mean : 0.2589 Mean :0.02576 Mean : 27560
## 3rd Qu.: 0.0000 3rd Qu.:0.00000 3rd Qu.: 25702
## Max. :21.0000 Max. :2.00000 Max. :573692
## NA's :3
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. :0.000 Min. : 0 Min. : 1.00
## 1st Qu.:0.110 1st Qu.: 533 1st Qu.: 9.00
## Median :0.560 Median : 4178 Median :17.00
## Mean :0.495 Mean : 15293 Mean :19.03
## 3rd Qu.:0.840 3rd Qu.: 14881 3rd Qu.:26.00
## Max. :1.890 Max. :646285 Max. :86.00
##
## TradesNeverDelinquent.per TradesOpenedLast6Months DebtToIncomeRatio
## Min. :0.0000 Min. : 0.0000 Min. : NA
## 1st Qu.:0.8000 1st Qu.: 0.0000 1st Qu.: NA
## Median :0.9500 Median : 0.0000 Median : NA
## Mean :0.8702 Mean : 0.8116 Mean :NaN
## 3rd Qu.:1.0000 3rd Qu.: 1.0000 3rd Qu.: NA
## Max. :1.0000 Max. :10.0000 Max. : NA
## NA's :621
## IncomeRange IncomeVerifiable StatedMonthlyIncome
## $0 :621 Mode :logical Min. :0.0000000
## Not displayed : 0 FALSE:598 1st Qu.:0.0000000
## Not employed : 0 TRUE :23 Median :0.0000000
## $1-24,999 : 0 Mean :0.0001342
## $25,000-49,999: 0 3rd Qu.:0.0000000
## $50,000-74,999: 0 Max. :0.0833330
## (Other) : 0
## LoanKey TotalProsperLoans
## 00013421083473792D70F75: 1 Min. :1.000
## 00C83418719708878D609C5: 1 1st Qu.:1.000
## 012233896012405328DE636: 1 Median :1.000
## 01713410652504452AED557: 1 Mean :1.506
## 023C3420631251102831DBE: 1 3rd Qu.:2.000
## 03113424359581104DFEFC8: 1 Max. :6.000
## (Other) :615 NA's :542
## TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 7.00 1st Qu.: 7.00
## Median :14.00 Median :14.00
## Mean :18.39 Mean :17.92
## 3rd Qu.:21.50 3rd Qu.:21.50
## Max. :81.00 Max. :74.00
## NA's :542 NA's :542
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. :0.0000 Min. :0
## 1st Qu.:0.0000 1st Qu.:0
## Median :0.0000 Median :0
## Mean :0.4684 Mean :0
## 3rd Qu.:0.0000 3rd Qu.:0
## Max. :7.0000 Max. :0
## NA's :542 NA's :542
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 1000 Min. : 0.0
## 1st Qu.: 2512 1st Qu.: 0.0
## Median : 5000 Median : 941.8
## Mean : 8374 Mean : 2452.9
## 3rd Qu.:10000 3rd Qu.: 4068.3
## Max. :40000 Max. :13283.0
## NA's :542 NA's :542
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-123.000 Min. : 0.0
## 1st Qu.: -26.000 1st Qu.: 0.0
## Median : 0.000 Median : 0.0
## Mean : 5.139 Mean : 557.9
## 3rd Qu.: 40.000 3rd Qu.:1413.0
## Max. : 102.000 Max. :2385.0
## NA's :542
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 1.00 Min. : 6.0 Min. : 8871
## 1st Qu.:11.00 1st Qu.:69.0 1st Qu.: 21859
## Median :16.00 Median :71.0 Median : 29988
## Mean :17.26 Mean :69.7 Mean : 29389
## 3rd Qu.:22.00 3rd Qu.:76.0 3rd Qu.: 32393
## Max. :40.00 Max. :84.0 Max. :101132
## NA's :384
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2007-03-23 Q2 2008:258
## 1st Qu.: 2500 1st Qu.:2007-11-21 Q4 2007: 94
## Median : 5000 Median :2008-04-23 Q3 2007: 87
## Mean : 7411 Mean :2008-05-25 Q1 2008: 77
## 3rd Qu.:10000 3rd Qu.:2008-06-06 Q3 2008: 40
## Max. :25000 Max. :2013-09-10 Q2 2007: 13
## (Other): 52
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 4D9C3403302047712AD0CDD: 7 Min. : 0.00 Min. : 0
## 9BB9341743156002349B42B: 4 1st Qu.: 87.14 1st Qu.: 1627
## 2C513411833228870171AD8: 3 Median : 169.68 Median : 3922
## C5A83382914214204CCBFF9: 3 Mean : 267.47 Mean : 6226
## E4D7339769953308959BCC1: 3 3rd Qu.: 347.58 3rd Qu.: 7735
## EC443373755428167EAFB22: 3 Max. :1130.90 Max. :40548
## (Other) :598
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0 Min. : 0.0 Min. :-455.68
## 1st Qu.: 1093 1st Qu.: 362.7 1st Qu.: -82.85
## Median : 3000 Median : 841.8 Median : -40.51
## Mean : 4735 Mean : 1491.5 Mean : -65.32
## 3rd Qu.: 5542 3rd Qu.: 1839.7 3rd Qu.: -14.68
## Max. :25000 Max. :15547.7 Max. : 0.00
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-2755.56 Min. : 0 Min. : 0
## 1st Qu.: 0.00 1st Qu.: 0 1st Qu.: 0
## Median : 0.00 Median : 0 Median : 0
## Mean : -31.18 Mean : 2581 Mean : 2535
## 3rd Qu.: 0.00 3rd Qu.: 2612 3rd Qu.: 2551
## Max. : 0.00 Max. :24598 Max. :24598
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.00 Min. :0.7322 Min. : 0.0000
## 1st Qu.: 0.00 1st Qu.:1.0000 1st Qu.: 0.0000
## Median : 0.00 Median :1.0000 Median : 0.0000
## Mean : 77.76 Mean :0.9996 Mean : 0.4058
## 3rd Qu.: 0.00 3rd Qu.:1.0000 3rd Qu.: 0.0000
## Max. :6941.10 Max. :1.0000 Max. :39.0000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. : 0.0000 Min. : 0.0 Min. : 1.0
## 1st Qu.: 0.0000 1st Qu.: 0.0 1st Qu.: 32.0
## Median : 0.0000 Median : 0.0 Median : 76.0
## Mean : 0.2899 Mean : 298.6 Mean :112.1
## 3rd Qu.: 0.0000 3rd Qu.: 0.0 3rd Qu.:153.0
## Max. :33.0000 Max. :23699.5 Max. :558.0
##
## Rating PercentYield CreditScore CreditHistoryAge
## B :149 Min. :-1.00000 Min. :529.5 Min. : 0.5507
## AA :125 1st Qu.:-0.42950 1st Qu.:649.5 1st Qu.: 8.0397
## A : 98 Median : 0.09404 Median :689.5 Median :13.9055
## C : 91 Mean :-0.05774 Mean :695.7 Mean :13.8511
## D : 74 3rd Qu.: 0.22531 3rd Qu.:749.5 3rd Qu.:18.7301
## HR : 46 Max. : 0.87279 Max. :869.5 Max. :47.1014
## (Other): 38 NA's :3
## MonthlyLoanPercent
## Min. :0.00000
## 1st Qu.:0.03369
## Median :0.03565
## Mean :0.03574
## 3rd Qu.:0.03923
## Max. :0.04524
##
It looks like most of those who report $0 income are in fact employed, which leads me to suspect that this measure is not correctly reported. I will therefore exclude this value from analysis.
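The check and the exclusion described here can be sketched as follows. The toy rows are hypothetical stand-ins for `borrower_data`; only the column names and the "$0" label come from the data.

```r
library(dplyr)

# Toy stand-in for borrower_data (hypothetical rows)
toy <- tibble(
  IncomeRange = factor(c("$0", "$0", "$0", "$1-24,999", "$25,000-49,999")),
  EmploymentStatus = factor(c("Full-time", "Self-employed", "Not employed",
                              "Full-time", "Full-time"))
)

# How many $0-income borrowers report each employment status?
zero_income <- toy %>%
  filter(IncomeRange == "$0") %>%
  count(EmploymentStatus)

# Drop the suspect category and its now-empty factor level
toy_clean <- toy %>%
  filter(IncomeRange != "$0") %>%
  droplevels()

zero_income
```

Calling `droplevels()` after the filter keeps the excluded category from showing up as an empty group in later tables or plots.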
table_stats_mult(plot_data, "Measure", "IncomeRange", "value")
EmploymentStatus
I generally expect that those who have a more secure employment status will receive more favorable terms. I will filter out observations where employment status was not available.
table_stats_mult(plot_data, "Measure", "EmploymentStatus", "value")
The trend here is not quite as clear - overall, it looks like those employed receive somewhat better interest rates. Those who are ‘employed’ or ‘self-employed’ receive the highest loan amounts. Those who are not employed have marginally higher monthly loan payments.
EmploymentStatusDuration
I expect that higher employment status duration may result in more favorable terms.
complex_cor("EmploymentStatusDuration")
## [1] "EmploymentStatusDuration vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -2.8004, df = 106310, p-value = 0.005104
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.01459900 -0.00257758
## sample estimates:
## cor
## -0.008588601
##
## [1] "EmploymentStatusDuration vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -6.4922, df = 106310, p-value = 8.499e-11
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.02591549 -0.01389795
## sample estimates:
## cor
## -0.01990744
##
## [1] "EmploymentStatusDuration vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 32.157, df = 106310, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.09219259 0.10409908
## sample estimates:
## cor
## 0.09814935
##
## [1] "EmploymentStatusDuration vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -15.161, df = 106310, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.05244613 -0.04044976
## sample estimates:
## cor
## -0.04644962
There appears to be a very slight but consistent trend for those with greater employment status duration to have higher loan amounts (\(r=0.10\), \(p<.001\)). The other trends are significant, but slight, and it is difficult to know what to make of them: APR (\(r=-0.009\), \(p<.01\)), interest (\(r=-0.02\), \(p<.001\)), monthly payments (\(r=-0.05\), \(p<.001\)).
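One way to judge what to make of such small but significant correlations is to convert them into the share of variance a linear relationship would explain, \(r^2\). The sketch below uses only the correlations printed above.

```r
# Correlations with EmploymentStatusDuration, copied from the output above
r <- c(BorrowerAPR        = -0.008588601,
       BorrowerRate       = -0.01990744,
       LoanOriginalAmount =  0.09814935,
       MonthlyLoanPercent = -0.04644962)

# Share of variance explained by a linear relationship, in percent
r_squared_pct <- round(100 * r^2, 2)
r_squared_pct
```

Even the strongest of these, loan amount, accounts for under 1% of the variance, so with over 100,000 observations statistical significance says little about practical importance.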
IsBorrowerHomeowner
I would expect homeowners to have slightly more favorable terms.
table_stats_mult(plot_data, "Measure", "IsBorrowerHomeowner", "value")
Overall, it seems that homeowners have very slightly lower APR and interest rates, higher loan amounts, and slightly lower proportional loan payments.
CreditScore
Overall, people with higher average credit scores should have markedly better terms.
complex_cor("CreditScore")
## [1] "CreditScore vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -160.21, df = 113340, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4344422 -0.4249487
## sample estimates:
## cor
## -0.4297073
##
## [1] "CreditScore vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -175.17, df = 113340, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4661358 -0.4569730
## sample estimates:
## cor
## -0.4615667
##
## [1] "CreditScore vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 122.07, df = 113340, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3357190 0.3460095
## sample estimates:
## cor
## 0.3408745
##
## [1] "CreditScore vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -64.06, df = 113340, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1925351 -0.1812987
## sample estimates:
## cor
## -0.186923
In fact, this is the case. Those with high credit scores have very markedly lower APR (\(r=-0.43\), \(p<.001\)) and interest rates (\(r=-0.46\), \(p<.001\)). On the other hand, they also have higher loan amounts (\(r=0.34\), \(p<.001\)) and slightly lower proportional loan payments (\(r=-0.19\), \(p<.001\)).
FirstRecordedCreditLine
I expect there to be little if any trend here, but those with longer credit histories may receive more favorable terms, all things being equal.
complex_cor("CreditHistoryAge")
## [1] "CreditHistoryAge vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -9.6642, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.03452583 -0.02288669
## sample estimates:
## cor
## -0.02870723
##
## [1] "CreditHistoryAge vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -17.8, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.05862916 -0.04701292
## sample estimates:
## cor
## -0.05282283
##
## [1] "CreditHistoryAge vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 56.666, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1603877 0.1717152
## sample estimates:
## cor
## 0.1660569
##
## [1] "CreditHistoryAge vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -21.487, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.06952240 -0.05792096
## sample estimates:
## cor
## -0.06372383
The trends here are difficult to interpret, although it looks like those with both very old, and more recent initial credit lines receive slightly less favorable terms (there is no clear trend, slight or otherwise, for monthly payments). Age of credit line is likely confounded with all sorts of other factors, however - from age, to employment status, to income.
On average, not taking into account any non-linear trends, those with longer credit history have slightly lower APRs (\(r=-0.03\), \(p<.001\)) and interest rates (\(r=-0.05\), \(p<.001\)), higher loan amounts (\(r=0.17\), \(p<.001\)), and slightly lower proportional monthly payments (\(r=-0.06\), \(p<.001\)).
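The caveat about non-linear trends matters because Pearson's \(r\) only captures the linear part of a relationship. The sketch below uses synthetic inverted-U data (not the Prosper data) and a simple quadratic fit, which is a stand-in of my choosing for a smoother such as a gam.

```r
set.seed(1)

# Synthetic inverted-U relationship with a little noise (not the Prosper data)
x <- seq(-1, 1, length.out = 201)
y <- 1 - x^2 + rnorm(201, sd = 0.05)

# Pearson's r only measures the linear component, which is nearly absent here
r_linear <- cor(x, y)

# A quadratic fit captures the curve that r misses
fit <- lm(y ~ x + I(x^2))
r2_quadratic <- summary(fit)$r.squared

c(r_linear = r_linear, r2_quadratic = r2_quadratic)
```

A near-zero correlation is therefore compatible with a strong curved relationship, which is why the plots and the linear summaries can tell different stories.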
OpenRevolvingAccounts
I would expect those with more open revolving accounts to have better borrower terms, as long as there are not too many revolving accounts open.
complex_cor("OpenRevolvingAccounts")
## [1] "OpenRevolvingAccounts vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -37.422, df = 113910, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1159353 -0.1044620
## sample estimates:
## cor
## -0.1102023
##
## [1] "OpenRevolvingAccounts vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -42.868, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1316975 -0.1202688
## sample estimates:
## cor
## -0.1259873
##
## [1] "OpenRevolvingAccounts vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 80.92, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2276295 0.2386114
## sample estimates:
## cor
## 0.2331279
##
## [1] "OpenRevolvingAccounts vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -28.347, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.08944859 -0.07791686
## sample estimates:
## cor
## -0.08368553
In fact, it looks like more open revolving accounts are associated with lower APR (\(r=-0.11\), \(p<.001\)) and interest rates (\(r=-0.13\), \(p<.001\)), higher loan amounts (\(r=0.23\), \(p<.001\)), and lower proportional monthly payments (\(r=-0.08\), \(p<.001\)), in general. At higher numbers of revolving accounts the data is sparser and the trend less clear, however.
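The format of the output above suggests that one `cor.test` is run per outcome measure against the chosen predictor. The following is a minimal sketch of that pattern, not the helper actually used in this analysis: the function name `cor_by_measure`, the data frame `toy_data`, and all values in it are hypothetical and synthetic.

```r
# Synthetic stand-in data (NOT the Prosper data).
set.seed(1)
n <- 200
toy_data <- data.frame(OpenRevolvingAccounts = rpois(n, 6))
toy_data$BorrowerAPR <- 0.25 - 0.005 * toy_data$OpenRevolvingAccounts +
  rnorm(n, sd = 0.05)
toy_data$LoanOriginalAmount <- 5000 + 300 * toy_data$OpenRevolvingAccounts +
  rnorm(n, sd = 3000)

# Hypothetical helper: one Pearson test per outcome measure.
# cor.test() silently drops incomplete pairs before testing.
cor_by_measure <- function(df, var, measures) {
  results <- lapply(measures, function(measure) {
    cor.test(df[[measure]], df[[var]])
  })
  names(results) <- measures
  results
}

tests <- cor_by_measure(toy_data, "OpenRevolvingAccounts",
                        c("BorrowerAPR", "LoanOriginalAmount"))

# Collect the estimates into one compact table instead of four printouts.
summary_table <- data.frame(
  measure = names(tests),
  r = sapply(tests, function(x) unname(x$estimate)),
  p = sapply(tests, function(x) x$p.value),
  row.names = NULL
)
print(summary_table)
```

Collecting the estimates in a table makes it easier to copy rounded values into the text without transcription slips.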
InquiriesLast6Months
I would expect more inquiries to correspond to worse borrower terms.
complex_cor("InquiriesLast6Months")
## [1] "InquiriesLast6Months vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 49.704, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1404144 0.1518145
## sample estimates:
## cor
## 0.1461193
##
## [1] "InquiriesLast6Months vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 62.926, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1781764 0.1894316
## sample estimates:
## cor
## 0.18381
##
## [1] "InquiriesLast6Months vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -34.804, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.10863762 -0.09711216
## sample estimates:
## cor
## -0.1028783
##
## [1] "InquiriesLast6Months vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 37.396, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1046922 0.1161989
## sample estimates:
## cor
## 0.1104492
The trends in the gam model are somewhat unclear, and for loan amounts and monthly payments they are hard to interpret, but it seems that those with more inquiries do get higher APRs and interest rates, although possibly only because inquiries are also reflected in their credit scores.
Not taking into account possible non-linearities, however, those with more recent inquiries have higher APRs (\(r=0.15\), \(p<.001\)) and interest rates (\(r=0.18\), \(p<.001\)), lower loan amounts (\(r=-0.10\), \(p<.001\)), and higher proportional monthly payments (\(r=0.11\), \(p<.001\)).
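Because Pearson's r only captures linear association, one simple complement is a rank-based (Spearman) correlation, which is sensitive to any monotone relationship. The sketch below is an illustration on synthetic data, not a result from the Prosper data: the variables `inquiries` and `apr` and the curved relationship between them are assumptions made up for the example.

```r
# Synthetic example (NOT the Prosper data): APR rises with inquiries,
# but the increase flattens out at higher counts.
set.seed(7)
n <- 500
inquiries <- rpois(n, lambda = 2)
apr <- 0.15 + 0.04 * log1p(inquiries) + rnorm(n, sd = 0.03)

# Pearson measures linear association; Spearman works on ranks,
# so it does not assume the relationship is a straight line.
pearson_r  <- cor(inquiries, apr, method = "pearson")
spearman_r <- cor(inquiries, apr, method = "spearman")

print(c(pearson = pearson_r, spearman = spearman_r))
```

If the two coefficients differ noticeably for a given predictor, that is a hint that the linear summary is missing part of the pattern.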
AmountDelinquent
Those with a higher amount delinquent should receive worse terms.
complex_cor("AmountDelinquent")
## [1] "AmountDelinquent vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 21.461, df = 106310, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.05969169 0.07166197
## sample estimates:
## cor
## 0.06567919
##
## [1] "AmountDelinquent vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 21.45, df = 106310, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.05965715 0.07162748
## sample estimates:
## cor
## 0.06564467
##
## [1] "AmountDelinquent vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -12.646, df = 106310, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.04475624 -0.03275216
## sample estimates:
## cor
## -0.0387556
##
## [1] "AmountDelinquent vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 12.857, df = 106310, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.03339668 0.04540016
## sample estimates:
## cor
## 0.03939984
Overall, what seems to make a difference is having any amount delinquent at all: those with no amount delinquent receive better terms, and those with some amount delinquent receive slightly worse terms. The terms do not seem to change further, however, as the amount delinquent increases. At higher values, there is too little data to draw clear conclusions about trends.
Not taking into account non-linearities in the data, those with a higher amount delinquent have higher APRs (\(r=0.07\), \(p<.001\)), higher interest rates (\(r=0.07\), \(p<.001\)), lower loan amounts (\(r=-0.04\), \(p<.001\)), and proportionately higher monthly payments (\(r=0.04\), \(p<.001\)).
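One way to separate "any amount delinquent" from "how much is delinquent" is to compare group means on a binary indicator, and then look at the correlation only among those with a non-zero amount. The sketch below does this on synthetic data in which, by construction, only the presence of a delinquent amount matters; the data frame `toy` and the column `AnyDelinquent` are hypothetical and not part of the original analysis.

```r
# Synthetic example (NOT the Prosper data): APR depends only on whether
# anything is delinquent, not on how much.
set.seed(42)
n <- 1000
amount <- ifelse(runif(n) < 0.85, 0, rexp(n, rate = 1 / 2000))
apr <- 0.20 + 0.03 * (amount > 0) + rnorm(n, sd = 0.05)
toy <- data.frame(AmountDelinquent = amount, BorrowerAPR = apr)

# Binary indicator: is any amount delinquent?
toy$AnyDelinquent <- toy$AmountDelinquent > 0

# Step 1: mean APR with vs. without any delinquent amount
group_means <- tapply(toy$BorrowerAPR, toy$AnyDelinquent, mean)
print(group_means)

# Step 2: among those with some delinquency, does the amount still matter?
delinquent_only <- toy[toy$AnyDelinquent, ]
within_r <- cor(delinquent_only$AmountDelinquent, delinquent_only$BorrowerAPR)
print(within_r)
```

A clear gap in the group means combined with a near-zero correlation within the delinquent group would match the pattern described above.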
DelinquenciesLast7Years
Those with more delinquencies should receive worse terms.
complex_cor("DelinquenciesLast7Years")
## [1] "DelinquenciesLast7Years vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 55.251, df = 112940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1565416 0.1678985
## sample estimates:
## cor
## 0.1622254
##
## [1] "DelinquenciesLast7Years vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 58.074, df = 112940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1646102 0.1759359
## sample estimates:
## cor
## 0.1702787
##
## [1] "DelinquenciesLast7Years vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -46.365, df = 112940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1423852 -0.1309392
## sample estimates:
## cor
## -0.1366667
##
## [1] "DelinquenciesLast7Years vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 24.87, df = 112940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.06799669 0.07959701
## sample estimates:
## cor
## 0.07379935
The trends here are relatively weak, but it does seem that there is a general tendency for those with more delinquencies to have higher APR (\(r=0.16\), \(p<.001\)) and interest rates (\(r=0.17\), \(p<.001\)), somewhat lower loan amounts (\(r=-0.14\), \(p<.001\)), and proportionately higher monthly payments (\(r=0.07\), \(p<.001\)).
RevolvingCreditBalance
I do not have clear predictions here, as I’m not certain how revolving credit balance reflects on general credit worthiness. On the basis of previous plots, I would predict that those with more revolving credit balance would receive more favorable terms.
complex_cor("RevolvingCreditBalance")
## [1] "RevolvingCreditBalance vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -19.122, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.06452731 -0.05254738
## sample estimates:
## cor
## -0.05853945
##
## [1] "RevolvingCreditBalance vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -19.472, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.06559529 -0.05361688
## sample estimates:
## cor
## -0.05960823
##
## [1] "RevolvingCreditBalance vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 63.408, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1850804 0.1966636
## sample estimates:
## cor
## 0.1908787
##
## [1] "RevolvingCreditBalance vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -11.963, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.04266217 -0.03065721
## sample estimates:
## cor
## -0.03666101
Overall, it looks like those with more credit balance have lower APR (\(r=-0.06\), \(p<.001\)) and interest rates (\(r=-0.06\), \(p<.001\)), higher loan amounts (\(r=0.19\), \(p<.001\)), and lower proportional monthly payments (\(r=-0.04\), \(p<.001\)). As revolving credit balance rises from zero, values change sharply at first and then more steadily: they decrease in the case of APR, interest rates, and monthly payments, and increase in the case of loan amounts.
BankcardUtilization
Those with higher bankcard utilization might be expected to have worse terms at higher values, if higher bankcard utilization corresponds to lower financial means. On the other hand, very low utilization may indicate low credit trustworthiness.
complex_cor("BankcardUtilization")
## [1] "BankcardUtilization vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 88.323, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2558295 0.2670290
## sample estimates:
## cor
## 0.261438
##
## [1] "BankcardUtilization vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 86.168, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2498551 0.2610917
## sample estimates:
## cor
## 0.255482
##
## [1] "BankcardUtilization vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -11.088, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.03998678 -0.02797954
## sample estimates:
## cor
## -0.03398438
##
## [1] "BankcardUtilization vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 18.621, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.05101798 0.06300003
## sample estimates:
## cor
## 0.05701106
In fact, as bankcard utilization rises, APR and interest rates dip at the very lowest values and then increase sharply. Similarly, loan amounts and monthly payments initially increase and then decrease.
Overall, those with more bankcard utilization have higher APR (\(r=0.26\), \(p<.001\)), interest rates (\(r=0.26\), \(p<.001\)), and proportional monthly payments (\(r=0.06\), \(p<.001\)), and lower loan amounts (\(r=-0.03\), \(p<.001\)).
DebtToIncomeRatio
A higher debt to income ratio should correspond to worse terms across the board.
complex_cor("DebtToIncomeRatio")
## [1] "DebtToIncomeRatio vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 18.312, df = 105360, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.05030622 0.06234452
## sample estimates:
## cor
## 0.05632742
##
## [1] "DebtToIncomeRatio vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 20.465, df = 105380, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.05690080 0.06892819
## sample estimates:
## cor
## 0.06291678
##
## [1] "DebtToIncomeRatio vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 3.2828, df = 105380, p-value = 0.001028
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.004074882 0.016148830
## sample estimates:
## cor
## 0.01011222
##
## [1] "DebtToIncomeRatio vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 8.6912, df = 105380, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.02072922 0.03279575
## sample estimates:
## cor
## 0.02676346
Higher debt to income ratios correspond generally to higher APRs, interest rates, and monthly payments, and to initially higher, and then gradually lower loan amounts. Overall, those with a higher ratio have higher APRs (\(r=0.06\), \(p<.001\)) and interest rates (\(r=0.06\), \(p<.001\)), marginally higher loan amounts (\(r=0.01\), \(p<.01\)), and marginally higher monthly loan payments (\(r=0.03\), \(p<.001\)).
IncomeVerifiable
Verifiable income should lead to at least somewhat more favorable terms.
table_stats_mult(plot_data, "Measure", "IncomeVerifiable", "value")
Those whose income is verifiable have lower APR and interest rates, and tend towards higher loan amounts and slightly lower proportional monthly payments.
TotalTrades
Those with more trade lines opened in the past might, at the least, have more financial means, which may correspond to more favorable terms.
complex_cor("TotalTrades")
## [1] "TotalTrades vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -13.677, df = 106390, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.04789068 -0.03589405
## sample estimates:
## cor
## -0.04189387
##
## [1] "TotalTrades vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -15.743, df = 106390, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.05420385 -0.04221405
## sample estimates:
## cor
## -0.04821068
##
## [1] "TotalTrades vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 59.734, df = 106390, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1743179 0.1859457
## sample estimates:
## cor
## 0.1801381
##
## [1] "TotalTrades vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -18.326, df = 106390, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.06208250 -0.05010258
## sample estimates:
## cor
## -0.05609456
Overall, it looks like those with more trades have lower APR (\(r=-0.04\), \(p<.001\)) and interest rates (\(r=-0.05\), \(p<.001\)), higher loan amounts (\(r=0.18\), \(p<.001\)), and slightly lower monthly loan payments (\(r=-0.06\), \(p<.001\)), although these effects appear to level off quickly at higher numbers of trades, rather than continuing steadily.
TradesNeverDelinquent
Fewer delinquencies should correspond to more favorable terms.
complex_cor("TradesNeverDelinquent.per")
## [1] "TradesNeverDelinquent.per vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -81.12, df = 106390, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2469995 -0.2356818
## sample estimates:
## cor
## -0.2413489
##
## [1] "TradesNeverDelinquent.per vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -88.257, df = 106390, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2667796 -0.2555817
## sample estimates:
## cor
## -0.2611895
##
## [1] "TradesNeverDelinquent.per vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 85.13, df = 106390, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2469002 0.2581515
## sample estimates:
## cor
## 0.2525344
##
## [1] "TradesNeverDelinquent.per vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -41.614, df = 106390, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1324633 -0.1206381
## sample estimates:
## cor
## -0.1265552
There is a very clear trend here - those whose trades were never, or rarely, delinquent have lower APR (\(r=-0.24\), \(p<.001\)) and interest rates (\(r=-0.26\), \(p<.001\)), higher original loan amounts (\(r=0.25\), \(p<.001\)), and lower proportional monthly payments (\(r=-0.13\), \(p<.001\)).
TotalProsperLoans
Those who have more existing Prosper loans may receive more favorable terms.
table_stats_mult(plot_data, "Measure", "TotalProsperLoans", "value")
There is only a slight and unstable trend here - those with more Prosper loans may receive slightly lower APR and interest rates.
OnTimeProsperPayments
Those with more on-time Prosper payments should receive more favorable terms.
complex_cor("OnTimeProsperPayments")
## [1] "OnTimeProsperPayments vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 2.7693, df = 22083, p-value = 0.005622
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.005444999 0.031813347
## sample estimates:
## cor
## 0.01863241
##
## [1] "OnTimeProsperPayments vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 1.0396, df = 22083, p-value = 0.2985
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.006193689 0.020182524
## sample estimates:
## cor
## 0.006995634
##
## [1] "OnTimeProsperPayments vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 5.2269, df = 22083, p-value = 1.74e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.02197311 0.04831802
## sample estimates:
## cor
## 0.03515167
##
## [1] "OnTimeProsperPayments vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 0.064626, df = 22083, p-value = 0.9485
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.01275394 0.01362356
## sample estimates:
## cor
## 0.0004348871
In fact, this is not at all the case - those with more on-time Prosper payments tend to have marginally higher APRs (\(r=0.02\), \(p<.01\)) and loan amounts (\(r=0.04\), \(p<.001\)), with little to no effect on interest rates (\(r=0.007\), \(p=0.30\)) and monthly payments (\(r=0.0004\), \(p=0.95\)). There appears to be no clear trend here, and my suspicion is that other factors which lead people to take out more Prosper loans may confound this.
ProsperPrincipalOutstanding
Those with more principal outstanding should have less favorable terms.
complex_cor("ProsperPrincipalOutstanding")
## [1] "ProsperPrincipalOutstanding vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -13.481, df = 22083, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.10341298 -0.07725075
## sample estimates:
## cor
## -0.09034745
##
## [1] "ProsperPrincipalOutstanding vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -12.78, df = 22083, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.09876183 -0.07257795
## sample estimates:
## cor
## -0.08568468
##
## [1] "ProsperPrincipalOutstanding vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 31.305, df = 22083, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1934751 0.2187319
## sample estimates:
## cor
## 0.2061378
##
## [1] "ProsperPrincipalOutstanding vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -18.356, df = 22083, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1355594 -0.1095782
## sample estimates:
## cor
## -0.1225898
Barring possible non-linearities, this confusingly does not appear to be the case - those with more principal outstanding have lower APRs (\(r=-0.09\), \(p<.001\)), interest rates (\(r=-0.09\), \(p<.001\)), and monthly payments (\(r=-0.12\), \(p<.001\)), and higher loan amounts (\(r=0.21\), \(p<.001\)). There is likely another confounding factor here, or I am missing something conceptually.
Recommendations
Those with more recommendations may receive more favorable terms, assuming the terms are assigned after the recommendations are posted.
complex_cor("Recommendations")
## [1] "Recommendations vs. BorrowerAPR"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -14.945, df = 113910, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.05003080 -0.03843919
## sample estimates:
## cor
## -0.04423648
##
## [1] "Recommendations vs. BorrowerRate"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -10.326, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.03637876 -0.02477656
## sample estimates:
## cor
## -0.03057869
##
## [1] "Recommendations vs. LoanOriginalAmount"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = -6.1409, df = 113940, p-value = 8.232e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.02399398 -0.01238477
## sample estimates:
## cor
## -0.01818999
##
## [1] "Recommendations vs. MonthlyLoanPercent"
##
## Pearson's product-moment correlation
##
## data: borrower_data[[measure]] and borrower_data[[var]]
## t = 2.8548, df = 113940, p-value = 0.004308
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.002650773 0.014262999
## sample estimates:
## cor
## 0.008457171
The data here is quite noisy, due to the paucity of recommendations on loans, but it seems that overall, APR (\(r=-0.04\), \(p<.001\)) and interest rates (\(r=-0.03\), \(p<.001\)) are lower for those with more recommendations, loan amounts are lower (\(r=-0.02\), \(p<.001\)), and monthly payments are marginally higher (\(r=0.008\), \(p<.01\)). This is not necessarily a causal effect, however, as there are likely to be additional confounding factors, and there is too little data in any case to draw firm conclusions.
LoanOriginationQuarter
Again, not knowing the company history, I have no clear predictions about what average terms might look like by quarter.
table_stats_mult(plot_data, "Measure", "LoanOriginationQuarter", "value")
Here it looks like APR and borrower interest rates have been fairly stable over the years, although increasing slightly and then decreasing again over time. The average loan amount and (correspondingly) monthly loan payment have fluctuated somewhat more, possibly in connection with the economy, Prosper finances, and/or the general popularity of Prosper with potential customers of various demographics.
What to me is most notable above is that there is a consistent relationship between potential lender yield, and potential or actual lender loss: the more the lender stands to gain, the riskier the loan, and the more they stand to lose. Furthermore, it seems that Prosper estimates of risk or loan quality are insufficiently conservative, and that except in the case of the highest-rated loans, lenders on average tend to lose money. I look at this in more detail below.
Here, I show that the ratings Prosper assigns to its loans, when they are posted, are reasonably informative predictors of the intermediate or final status of the loans. Loans with the highest Prosper ratings are rarely defaulted on or charged off, whereas those in the lower Prosper ratings are frequently, or even more often than not, defaulted on or charged off. This suggests that lenders are well-advised to take the ratings assigned to loans by Prosper into account, when deciding which loans to invest in.
This plot suggests that Prosper drastically overestimates how much lenders are likely to profit from a loan, likely either on the basis of faulty prediction, and/or in an effort to entice lenders to sign up with the service.
As can be seen above in the first row, even after correcting for potential loss on charge-offs, Prosper suggests that lender return will on average be positive for all loans, including (or especially) those with low Prosper scores, who take out high-interest loans. However, as can be seen from the actual data in the second row, the range of return in all categories is far closer to 0, frequently falling below it (boxplots on the left) - and in fact lenders, on average, profit only from the highest-rated loans (whereas Prosper suggests that their return will be less than that of lower-rated loans).
This data suggests that prospective lenders should treat all estimates of profit as insufficiently conservative, and that although they stand to possibly gain more through investing in higher-interest loans, on average they will in fact profit only from investing in very low-risk loans.
Although I can’t explain the cause of the trends seen here without knowing more about the company history, it is clear that there are lengthy periods where up to about 50% of closed loans are charged off or defaulted on, and correspondingly lengthy periods where lenders, on average, lose a significant amount of money by investing. This suggests that while there may be periods when it is generally profitable for lenders to invest in Prosper loans, overall it is a relatively risky venture, and there are periods when lenders will on average lose a lot of money. This may however in part be due to lenders specifically seeking out high-risk and high-interest loans, even if they are relatively unlikely to pay off, ultimately.
First, I encountered some amount of trouble interpreting the data without any background story. Googling around for information on Prosper loans online, I was able to get a general idea of what the company was doing, which made interpreting the data significantly easier.
At this point, it is still difficult to say much about this data without having more information about how the data was collected, why some data is missing, how certain measures were assigned, and how lender profit and borrower terms reflect on the profit of the company. One would need to take a much closer and more in-depth look at what the company does, what purpose the data serves, how the measures were collected, how they are used, and what they are meant to reflect.
Starting with this data set, in the future it may be possible to build a model that is more predictive of lender yield than current measures provided by Prosper. This would likely be very useful for lenders choosing whether to invest in certain loans, or not. What would also be interesting is separately gathering data on which factors are most decisive in convincing lenders to invest in certain loans.